RayEnv

Full path: schola.rllib.env.RayEnv

RayEnv

RayEnv(protocol, simulator, verbosity=0)

Bases: BaseRayEnv, MultiAgentEnv

Schola’s single-environment implementation of MultiAgentEnv for Unreal Engine.

This class manages a single multi-agent environment communicating with Unreal Engine via a protocol/simulator architecture. It is compatible with gymnasium wrappers and always returns the MultiAgentDict format.

Inherits from:

  • BaseRayEnv: shared protocol, simulator, and space management
  • MultiAgentEnv: RLlib’s single-environment multi-agent interface

Use this class when:

  • Running with the local runner (num_env_runners=0)
  • Only one parallel environment is needed
  • Gymnasium wrappers need to be applied

Key Features:

  • Single Unreal environment (num_envs must equal 1)
  • Multi-agent support
  • Protocol-based communication with Unreal Engine
  • Compatible with gymnasium wrappers (inherits from MultiAgentEnv -> gym.Env)
  • Always returns MultiAgentDict format
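
The MultiAgentDict format mentioned above is a plain dict keyed by agent ID. A minimal sketch of the shapes involved (the agent IDs and values here are illustrative, not from Schola):

```python
# Illustrative MultiAgentDict shapes; agent IDs are hypothetical.
observations = {"agent_0": [0.1, 0.2], "agent_1": [0.3, 0.4]}
rewards = {"agent_0": 1.0, "agent_1": -0.5}

# RLlib convention: the terminateds/truncateds dicts carry a special
# "__all__" key that signals whether the whole episode is done.
terminateds = {"agent_0": False, "agent_1": False, "__all__": False}

# Every per-agent dict shares the same set of agent IDs.
assert set(observations) == set(rewards)
```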

Methods

  • __init__(protocol, simulator, verbosity=0): Initialize protocol, simulator, and shared environment infrastructure.
  • close_extras(**kwargs): Close the protocol and stop the simulator.
  • reset(*, seed=None, options=None): Reset the environment.
  • step(actions): Step the environment with the given actions.

Attributes

  • action_space: Action space (Dict of agent spaces).
  • max_num_agents: Maximum number of agents that can exist.
  • num_agents: Total number of agents ever seen.
  • observation_space: Observation space (Dict of agent spaces).
  • single_action_space: Single-agent action space.
  • single_action_spaces: Dict mapping agent IDs to action spaces.
  • single_observation_space: Single-agent observation space.
  • single_observation_spaces: Dict mapping agent IDs to observation spaces.

Parameters

protocol (BaseRLProtocol): protocol used to communicate with Unreal Engine

simulator (BaseSimulator): simulator managing the Unreal Engine process

verbosity (int): logging verbosity level; defaults to 0

__init__

__init__(protocol, simulator, verbosity=0)

Initialize protocol, simulator, and shared environment infrastructure.

Parameters

protocol (BaseRLProtocol): protocol used to communicate with Unreal Engine

simulator (BaseSimulator): simulator managing the Unreal Engine process

verbosity (int): logging verbosity level; defaults to 0


reset

reset(*, seed=None, options=None)

Reset the environment.

Parameters

seed (int | None): random seed for the environment reset

options (Dict[str, Any] | None): optional reset options

Returns

Tuple of (observations, infos) in MultiAgentDict format.

Return type: Tuple[Dict[str, Any], Dict[str, Any]]
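
The reset contract can be sketched with a self-contained stand-in. This is not Schola code: ToyMultiAgentEnv and its agent IDs are invented here purely to show the 2-tuple of per-agent dicts that RayEnv.reset returns.

```python
import random
from typing import Any, Dict, Optional, Tuple


class ToyMultiAgentEnv:
    """Hypothetical stand-in mimicking RayEnv's reset signature and return shape."""

    def __init__(self) -> None:
        self.agent_ids = ["agent_0", "agent_1"]  # invented agent IDs
        self.rng = random.Random()

    def reset(
        self, *, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None
    ) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # Seeding makes the toy observations reproducible.
        if seed is not None:
            self.rng.seed(seed)
        observations = {aid: [self.rng.random()] for aid in self.agent_ids}
        infos = {aid: {} for aid in self.agent_ids}
        return observations, infos


env = ToyMultiAgentEnv()
obs, infos = env.reset(seed=42)
```

Both returned dicts are keyed by agent ID, matching the MultiAgentDict format described above.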


step

step(actions)

Step the environment with the given actions.

Parameters

actions (Dict[str, Any]): action dict mapping agent IDs to actions (MultiAgentDict)

Returns

Tuple of (observations, rewards, terminateds, truncateds, infos) in MultiAgentDict format.

Return type: Tuple[Dict[str, Any], Dict[str, float], Dict[str, bool], Dict[str, bool], Dict[str, Any]]
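
The 5-tuple returned by step can likewise be sketched with a stand-in function (toy_step is invented here; only the return shape mirrors RayEnv.step):

```python
from typing import Any, Dict, Tuple


def toy_step(actions: Dict[str, Any]) -> Tuple[
    Dict[str, Any], Dict[str, float], Dict[str, bool], Dict[str, bool], Dict[str, Any]
]:
    """Hypothetical stand-in returning the same 5-tuple shape as RayEnv.step."""
    obs = {aid: [0.0] for aid in actions}
    rewards = {aid: 1.0 for aid in actions}
    # RLlib expects an "__all__" key in the terminateds/truncateds dicts.
    terminateds = {aid: False for aid in actions}
    terminateds["__all__"] = False
    truncateds = {aid: False for aid in actions}
    truncateds["__all__"] = False
    infos: Dict[str, Any] = {aid: {} for aid in actions}
    return obs, rewards, terminateds, truncateds, infos


obs, rewards, terminateds, truncateds, infos = toy_step({"agent_0": 1, "agent_1": 0})
```

A training loop would pass the per-agent action dict in and read rewards and done flags back out per agent ID.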