Environments¶
Saturation Environment¶
- class gym_saturation.envs.saturation_env.SaturationEnv¶
Saturation algorithm in a reinforcement learning friendly way.
It’s an abstract class, so here we have only trivial smoke tests. One should override
_do_deductionsmethod in children classes.Refer to SaturationEnv for more documentation.
>>> class DummyProver(SaturationEnv): ... def _do_deductions(action): ... pass
>>> env = DummyProver()
- get_task() str¶
Get the task that the agent is performing in the current environment.
- Returns:
a TPTP problem filename
- render() None¶
No render.
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) tuple[tuple[str, ...], dict[str, Any]]¶
Reset the environment.
- Parameters:
seed – seed for compatibility
options – options for compatibility
- Returns:
observations and info
- set_task(task: str) None¶
Set the specified task to the current environment.
- Parameters:
task – a TPTP problem filename
- step(action: Any) tuple[tuple[str, ...], float, bool, bool, dict[str, Any]]¶
Run one time-step of the environment’s dynamics.
When end of episode is reached, you are responsible for calling
reset()to reset this environment’s state. Accepts an action and returns a tuple (observation, reward, terminated, truncated, info)- Parameters:
action – an action provided by the agent
- Returns:
a tuple of four values:
observation: agent’s observation of the current environment
reward: amount of reward returned after previous action
terminated: whether the proof was found
truncated: whether the episode was finished for an external reason (e.g. time limit)
info: contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Saturation Environment with Vampire back-end¶
- class gym_saturation.envs.vampire_env.VampireEnv(prover_binary_path: str = 'vampire')¶
An RL environment wrapper around Vampire prover.
- Parameters:
prover_binary_path – a path to Vampire binary; by default we expect it to be in the $PATH
Refer to SaturationEnv for more documentation.
We can run a full Gymnasium environment check:
>>> from gymnasium.utils.env_checker import check_env >>> import gymnasium as gym >>> env = gym.make("Vampire-v0").unwrapped >>> check_env(env)
repeating actions change nothing
>>> env = gym.make("Vampire-v0") >>> _ = env.reset() >>> one = env.step("c_1") >>> two = env.step("c_1") >>> one == two True
sometimes Vampire can solve a problem during pre-processing
>>> from gym_saturation.constants import MOCK_TPTP_PROBLEM >>> trivial_problem = os.path.join(os.path.dirname(MOCK_TPTP_PROBLEM), ... "TST002-1.p") >>> env.unwrapped.set_task(trivial_problem) >>> _, _ = env.reset() >>> env.unwrapped._terminated True >>> _, _, terminated, _, _ = env.step("anything") >>> terminated True >>> env.close()
a test of an unexpected reply from Vampire
>>> from gym_saturation.constants import MOCK_TPTP_FOLDER >>> vampire_binary = os.path.join(MOCK_TPTP_FOLDER, "..", "vampire-mock") >>> vampire_env = VampireEnv(prover_binary_path=vampire_binary) >>> vampire_env.reset() Traceback (most recent call last): ... ValueError: ('Unexpected response type: ', 'who could expect that?')
- close() None¶
Terminate Vampire process.
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) tuple[tuple[str, ...], dict[str, Any]]¶
Reset the environment.
- Parameters:
seed – seed for compatibility
options – options for compatibility
- Returns:
observations and info
Saturation Environment with iProver back-end¶
- class gym_saturation.envs.iprover_env.IProverEnv(prover_binary_path: str = 'iproveropt')¶
An RL environment around iProver.
- Parameters:
prover_binary_path – a path to iProver binary; by default, we assume it to be
iproveroptand in the $PATH
Refer to SaturationEnv for more documentation.
>>> from gymnasium.utils.env_checker import check_env >>> import gymnasium as gym >>> env = gym.make( ... "iProver-v0", ... ).unwrapped >>> env.relay_server Traceback (most recent call last): ... ValueError: run ``reset`` first! >>> check_env(env)
episode is never truncated by default
>>> _ = env.reset() >>> _, _, _, truncated, _ = env.step("c_49") >>> truncated False
- close() None¶
Stop relay server.
- property relay_server: RelayServer¶
Return the relay server object.
- Raises:
ValueError – if called before
reset
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) tuple[tuple[str, ...], dict[str, Any]]¶
Reset the environment.
- Parameters:
seed – seed for compatibility
options – options for compatibility
- Returns:
observations and info