SaturationEnv¶
SaturationEnv
is an abstract class for environments guiding the choice of a given clause in the saturation algorithm used to build automated theorem provers. It has two subclasses:
VampireEnv
is an environment for guiding the choice of a given clause in the saturation loop of a Vampire prover. Since we focus on guiding the saturation loop here, we don’t use the Avatar [3]IProverEnv
is an environment for guiding the choice of a given clause in the saturation loop of the iProver [1]
Action Space |
|
Observation Space |
|
import |
|
import |
|
Here Clause(n)
is an alias for
Dict(
'birth_step': Discrete(n),
'inference_parents': Sequence(
Text(1, 256, characters=ALPHANUMERIC_WITH_UNDERSCORE),
stack=False
),
'inference_rule': Text(1, 256, characters=ALPHANUMERIC_WITH_UNDERSCORE),
'label': Text(1, 256, characters=ALPHANUMERIC_WITH_UNDERSCORE),
'literals': Text(1, 4000, characters=EXTENDED_ALPHANUMERIC),
'role': Text(1, 256, characters=ALPHANUMERIC_WITH_UNDERSCORE)
)
and EXTENDED_ALPHANUMERIC
is ALPHANUMERIC_WITH_UNDERSCORE
extended by nine special characters (), |~=!$
. Such a structure corresponds to clauses (logical statements) in the TPTP language.
Description¶
The given clause (or saturation) algorithm is the basis of many contemporary provers. See [2] for an excellent introduction and Python code snippets. In short, the algorithm is the following one:
unprocessed_clauses: list[Clause] = get_preprocessed_theorem_statement()
processed_clauses: list[Clause] = []
while EMPTY_CLAUSE not in unprocessed_clauses and unprocessed_clauses:
given_clause: Clause = select_given_clause(unprocessed_clauses)
processed_clauses.append(given_clause)
new_clauses: list[Clause] = apply_inference_rules(
given_clause, processed_clauses
)
unprocessed_clauses.extend(new_clauses)
unprocessed_clauses.remove(given_clause)
get_preprocessed_theorem_statement
corresponds to the environment reset, and typically includes parsing, Skolemization, transformation to conjunctive normal form among other things.
apply_inference_rules
is a ‘logic engine’ of a prover corresponding to what happens during the environment’s step
. It usually includes resolution, superposition and other nasty stuff. To guarantee the algorithm applicability, the inference rules system must be refutation complete.
select_given_clause
is a trainable agent policy.
If the EMPTY_CLAUSE
(aka falsehood or contradiction) appears among the unprocessed_clauses
, it means we arrived at a refutation of the original set of clauses, which gives us a proof by contradiction for our theorem. If the unprocessed_clauses
becomes empty, it means we’ve built a counter-example for our proposition.
Action Space¶
Action is an index of a given clause. It belongs to a discrete space of size n
. n
is the maximal number of clauses in a proof state (unprocessed_clauses
and processed_clauses
together).
Observation Space¶
An observation is a tuple of strings, containing a clause literals in the TPTP syntax, e.g. 'mult(X, mult(Y, Z)) = mult(mult(X, Y), Z)'
for each clause belonging to unprocessed_clauses
and processed_clauses
Starting State¶
A starting state of the environment depends on a task set (a theorem to prove). If there are N
unprocessed clauses in the pre-processed theorem statement, the starting state contains N
strings.
By default, the task is a simple theorem from group theory:
cnf(associativity, axiom, mult(X, mult(Y, Z)) = mult(mult(X, Y), Z)).
cnf(left_identity, axiom, mult(e, X) = X).
cnf(left_inverse, axiom, mult(inv(X), X) = e).
cnf(idempotent_element, hypothesis, mult(a, a) = a).
cnf(negated_conjecture, negated_conjecture, ~ a = e).
One can set another task by specifying a filename of a respective TPTP problem:
env.set_task(filename)
Rewards¶
Reward is 1.0
after a step iff the saturation algorithm terminated at this step, and 0.0
otherwise.
Episode End¶
Termination means the saturation algorithm ended with refutation found or satisfiability established.
Truncation happens if the number of clauses in the state exceeds
action_space.n
.
Information¶
The environment returns no additional information.
Arguments¶
import gymnasium
gymnasium.make(
"Vampire-v0", # or "iProver-v0"
max_clauses=1000,
render_mode="human",
prover_binary_path="vampire", # or "iproveropt"
)
max_clauses=1000
: the size n
of the action space.
render_mode="human"
: either ansi
(return the clauses from the current proof state in the TPTP format) or human
(print the ansi
rendering to the standard output)
prover_binary_path="vampire"
(or "iproveropt"
): the path to a prover binary (supposed to be on the $PATH
by default)
References¶
Version History¶
v0: Initial version release