SaturationEnv¶
SaturationEnv is an abstract class for environments guiding the choice of a given clause in the saturation algorithm used to build automated theorem provers. It has two subclasses:
VampireEnvis an environment for guiding the choice of a given clause in the saturation loop of a Vampire prover. Since we focus on guiding the saturation loop here, we don’t use the Avatar [3]IProverEnvis an environment for guiding the choice of a given clause in the saturation loop of the iProver [1]
Action Space |
|
Observation Space |
|
import |
|
import |
|
EXTENDED_ALPHANUMERIC should cover characters used by the TPTP language.
Description¶
The given clause (or saturation) algorithm is the basis of many contemporary provers. See [2] for an excellent introduction and Python code snippets. In short, the algorithm is the following one:
unprocessed_clauses: list[Clause] = get_preprocessed_theorem_statement()
processed_clauses: list[Clause] = []
while EMPTY_CLAUSE not in unprocessed_clauses and unprocessed_clauses:
given_clause: Clause = select_given_clause(unprocessed_clauses)
processed_clauses.append(given_clause)
new_clauses: list[Clause] = apply_inference_rules(
given_clause, processed_clauses
)
unprocessed_clauses.extend(new_clauses)
unprocessed_clauses.remove(given_clause)
get_preprocessed_theorem_statement corresponds to the environment reset, and typically includes parsing, Skolemization, transformation to conjunctive normal form among other things.
apply_inference_rules is a ‘logic engine’ of a prover corresponding to what happens during the environment’s step. It usually includes resolution, superposition and other nasty stuff. To guarantee the algorithm applicability, the inference rules system must be refutation complete.
select_given_clause is a trainable agent policy.
If the EMPTY_CLAUSE (aka falsehood or contradiction) appears among the unprocessed_clauses, it means we arrived at a refutation of the original set of clauses, which gives us a proof by contradiction for our theorem. If the unprocessed_clauses becomes empty, it means we’ve built a counter-example for our proposition.
Action Space¶
Action is a label of a given clause. Different provers use different labelling systems.
Observation Space¶
An observation is a tuple of strings, containing a clause literals in the TPTP syntax, e.g. 'mult(X, mult(Y, Z)) = mult(mult(X, Y), Z)' for each clause belonging to unprocessed_clauses and processed_clauses
Starting State¶
A starting state of the environment depends on a task set (a theorem to prove). If there are N unprocessed clauses in the pre-processed theorem statement, the starting state contains N strings.
By default, the task is a simple theorem from group theory:
cnf(associativity, axiom, mult(X, mult(Y, Z)) = mult(mult(X, Y), Z)).
cnf(left_identity, axiom, mult(e, X) = X).
cnf(left_inverse, axiom, mult(inv(X), X) = e).
cnf(idempotent_element, hypothesis, mult(a, a) = a).
cnf(negated_conjecture, negated_conjecture, ~ a = e).
One can set another task by specifying a filename of a respective TPTP problem:
env.set_task(filename)
Rewards¶
Reward is always 0.0. One has to use reward wrappers
according to their training strategy.
Episode End¶
Termination means the saturation algorithm ended with refutation found or satisfiability established.
There is no default truncation condition (one can add it using wrappers, e.g. TimeLimit)
Information¶
The environment returns no additional information.
Arguments¶
import gymnasium
gymnasium.make(
"Vampire-v0", # or "iProver-v0"
prover_binary_path="vampire", # or "iproveropt"
)
prover_binary_path="vampire" (or "iproveropt"): the path to a prover binary (supposed to be on the $PATH by default)
References¶
Version History¶
v0: Initial version release