Note
Go to the end to download the full example code
Random and age agents for Vampire and iProver¶
Random agent for Vampire¶
To make a gym-saturation
environment, we have to import the package
import gymnasium as gym
import gym_saturation
then we can make a prover environment as any other Gymnasium one
env = gym.make("Vampire-v0")
before using the environment, we should reset it
observation, info = env.reset()
gym-saturation
environments don’t return any info
print(info)
{}
observation is a tuple of JSON representations of logic clauses
from pprint import pprint
pprint(observation)
({'birth_step': 0,
'inference_parents': (),
'inference_rule': 'input',
'label': '1',
'literals': 'mult(X0,mult(X1,X2)) = mult(mult(X0,X1),X2)',
'role': 'lemma'},
{'birth_step': 0,
'inference_parents': (),
'inference_rule': 'input',
'label': '2',
'literals': 'mult(e,X0) = X0',
'role': 'lemma'},
{'birth_step': 0,
'inference_parents': (),
'inference_rule': 'input',
'label': '3',
'literals': 'e = mult(inv(X0),X0)',
'role': 'lemma'},
{'birth_step': 0,
'inference_parents': (),
'inference_rule': 'input',
'label': '4',
'literals': 'a = mult(a,a)',
'role': 'lemma'},
{'birth_step': 0,
'inference_parents': (),
'inference_rule': 'input',
'label': '5',
'literals': 'e != a',
'role': 'lemma'})
We can render the environment state in the TPTP format. By default, we are trying to prove a basic group theory lemma: every idempotent element equals the identity
env.render()
cnf(1, lemma, mult(X0,mult(X1,X2)) = mult(mult(X0,X1),X2), inference(input, [], [])).
cnf(2, lemma, mult(e,X0) = X0, inference(input, [], [])).
cnf(3, lemma, e = mult(inv(X0),X0), inference(input, [], [])).
cnf(4, lemma, a = mult(a,a), inference(input, [], [])).
cnf(5, lemma, e != a, inference(input, [], [])).
here is an example of an episode during which we play random actions. We set the random seed for reproducibility.
env.action_space.seed(0)
terminated, truncated = False, False
while not (terminated or truncated):
action = env.action_space.sample()
observation, reward, terminated, truncated, info = env.step(action)
env.close()
the episode terminated with positive reward
print(terminated, truncated, reward)
True False 1.0
It means we arrived at a contradiction ($false
) which proves the lemma.
Notice the birth_step
number of a contradiction, it shows how many steps
we did to find proof.
pprint(observation[-1])
{'birth_step': 1077,
'inference_parents': ('17', '5'),
'inference_rule': 'subsumption_resolution',
'label': '18',
'literals': '$false',
'role': 'lemma'}
Age agent for iProver¶
We initialise iProver-based environment in the same way
env = gym.make("iProver-v0")
Instead of a random agent, let’s use Age agent which selects actions in the order they appear
observation, info = env.reset()
terminated, truncated = False, False
action = 0
while not (terminated or truncated):
observation, reward, terminated, truncated, info = env.step(action)
action += 1
env.close()
Loop <_UnixSelectorEventLoop running=False closed=True debug=False> that handles pid 7094 is closed
We still arrive at contradiction but it takes a different number of steps
print(terminated, truncated, reward)
pprint(observation[-1])
True False 1.0
{'birth_step': 1,
'inference_parents': ('c_85', 'c_53'),
'inference_rule': 'forward_subsumption_resolution',
'label': 'c_86',
'literals': '$false',
'role': 'lemma'}
Total running time of the script: (0 minutes 0.825 seconds)