Environments¶

Saturation Environment¶

class gym_saturation.envs.saturation_env.SaturationEnv¶

Saturation algorithm in a reinforcement learning friendly way.

It’s an abstract class, so here we have only trivial smoke tests. One should override _do_deductions method in children classes.

Refer to SaturationEnv for more documentation.

>>> class DummyProver(SaturationEnv):
...     def _do_deductions(action):
...         pass

>>> env = DummyProver()

get_task() → str¶

Get the task that the agent is performing in the current environment.

Returns:: a TPTP problem filename

render() → None¶: No render.

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[tuple[str, ...], dict[str, Any]]¶

Reset the environment.

Parameters:

seed – seed for compatibility
options – options for compatibility

Returns:

observations and info

set_task(task: str) → None¶

Set the specified task to the current environment.

Parameters:: task – a TPTP problem filename

step(action: Any) → tuple[tuple[str, ...], float, bool, bool, dict[str, Any]]¶

Run one time-step of the environment’s dynamics.

When end of episode is reached, you are responsible for calling reset() to reset this environment’s state. Accepts an action and returns a tuple (observation, reward, terminated, truncated, info)

Parameters:

action – an action provided by the agent

Returns:

a tuple of four values:

observation: agent’s observation of the current environment
reward: amount of reward returned after previous action
terminated: whether the proof was found
truncated: whether the episode was finished for an external reason (e.g. time limit)
info: contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)

Saturation Environment with Vampire back-end¶

class gym_saturation.envs.vampire_env.VampireEnv(prover_binary_path: str = 'vampire')¶

An RL environment wrapper around Vampire prover.

Parameters:: prover_binary_path – a path to Vampire binary; by default we expect it to be in the $PATH

Refer to SaturationEnv for more documentation.

We can run a full Gymnasium environment check:

>>> from gymnasium.utils.env_checker import check_env
>>> import gymnasium as gym
>>> env = gym.make("Vampire-v0").unwrapped
>>> check_env(env)

repeating actions change nothing

>>> env = gym.make("Vampire-v0")
>>> _ = env.reset()
>>> one = env.step("c_1")
>>> two = env.step("c_1")
>>> one == two
True

sometimes Vampire can solve a problem during pre-processing

>>> from gym_saturation.constants import MOCK_TPTP_PROBLEM
>>> trivial_problem = os.path.join(os.path.dirname(MOCK_TPTP_PROBLEM),
...     "TST002-1.p")
>>> env.unwrapped.set_task(trivial_problem)
>>> _, _ = env.reset()
>>> env.unwrapped._terminated
True
>>> _, _, terminated, _, _ = env.step("anything")
>>> terminated
True
>>> env.close()

a test of an unexpected reply from Vampire

>>> from gym_saturation.constants import MOCK_TPTP_FOLDER
>>> vampire_binary = os.path.join(MOCK_TPTP_FOLDER, "..", "vampire-mock")
>>> vampire_env = VampireEnv(prover_binary_path=vampire_binary)
>>> vampire_env.reset()
Traceback (most recent call last):
 ...
ValueError: ('Unexpected response type: ', 'who could expect that?')

close() → None¶: Terminate Vampire process.

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[tuple[str, ...], dict[str, Any]]¶

Reset the environment.

Parameters:

seed – seed for compatibility
options – options for compatibility

Returns:

observations and info

Saturation Environment with iProver back-end¶

class gym_saturation.envs.iprover_env.IProverEnv(prover_binary_path: str = 'iproveropt')¶

An RL environment around iProver.

Parameters:: prover_binary_path – a path to iProver binary; by default, we assume it to be iproveropt and in the $PATH

Refer to SaturationEnv for more documentation.

>>> from gymnasium.utils.env_checker import check_env
>>> import gymnasium as gym
>>> env = gym.make(
...     "iProver-v0",
... ).unwrapped
>>> env.relay_server
Traceback (most recent call last):
 ...
ValueError: run ``reset`` first!
>>> check_env(env)

episode is never truncated by default

>>> _ = env.reset()
>>> _, _, _, truncated, _ = env.step("c_49")
>>> truncated
False

close() → None¶: Stop relay server.

property relay_server: RelayServer¶

Return the relay server object.

Raises:: ValueError – if called before reset

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[tuple[str, ...], dict[str, Any]]¶

Reset the environment.

Parameters:

seed – seed for compatibility
options – options for compatibility

Returns:

observations and info