RayEnv

Full path: schola.rllib.env.RayEnv

RayEnv

RayEnv(protocol, simulator, verbosity=0)

Bases: BaseRayEnv, MultiAgentEnv

Schola’s single-environment implementation of MultiAgentEnv for Unreal Engine.

This class manages a single multi-agent environment communicating with Unreal Engine via a protocol/simulator architecture. It is compatible with gymnasium wrappers and always returns the MultiAgentDict format.

Inherits from:

  • BaseRayEnv: shared protocol, simulator, and space management
  • MultiAgentEnv: RLlib’s single-environment multi-agent interface

Use this class when:

  • Running with the local runner (num_env_runners=0)
  • Only one parallel environment is needed
  • Gymnasium wrappers need to be applied

Key Features:

  • Single Unreal environment (num_envs must equal 1)
  • Multi-agent support
  • Protocol-based communication with Unreal Engine
  • Compatible with gymnasium wrappers (inherits from MultiAgentEnv -> gym.Env)
  • Always returns MultiAgentDict format
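
The MultiAgentDict format mentioned above is a plain dict keyed by agent ID. A minimal sketch of the shapes involved (the agent IDs and values here are illustrative, not from Schola):

```python
# Illustrative MultiAgentDict shapes; agent IDs are hypothetical.
observations = {"agent_0": [0.1, 0.2], "agent_1": [0.3, 0.4]}
rewards = {"agent_0": 1.0, "agent_1": -0.5}

# RLlib convention: the terminateds/truncateds dicts carry a special
# "__all__" key that signals whether the whole episode is done.
terminateds = {"agent_0": False, "agent_1": False, "__all__": False}

# Every per-agent dict shares the same set of agent IDs.
assert set(observations) == set(rewards)
```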

Methods

  • __init__(protocol, simulator, verbosity=0): Initialize protocol, simulator, and shared environment infrastructure.
  • close_extras(**kwargs): Close the protocol and stop the simulator.
  • reset(*, seed=None, options=None): Reset the environment.
  • step(actions): Step the environment with the given actions.

Attributes

  • action_space: Action space (Dict of agent spaces).
  • max_num_agents: Maximum number of agents that can exist.
  • num_agents: Total number of agents ever seen.
  • observation_space: Observation space (Dict of agent spaces).
  • single_action_space: Single-agent action space.
  • single_action_spaces: Dict mapping agent IDs to action spaces.
  • single_observation_space: Single-agent observation space.
  • single_observation_spaces: Dict mapping agent IDs to observation spaces.

Parameters

protocol (BaseRLProtocol): protocol used to communicate with Unreal Engine

simulator (BaseSimulator): simulator managing the Unreal Engine process

verbosity (int): logging verbosity level; defaults to 0

__init__

__init__(protocol, simulator, verbosity=0)

Initialize protocol, simulator, and shared environment infrastructure.

Parameters

protocol (BaseRLProtocol): protocol used to communicate with Unreal Engine

simulator (BaseSimulator): simulator managing the Unreal Engine process

verbosity (int): logging verbosity level; defaults to 0


reset

reset(*, seed=None, options=None)

Reset the environment.

Parameters

seed (int | None): random seed for the environment reset

options (Dict[str, Any] | None): optional reset options

Returns

Tuple of (observations, infos) in MultiAgentDict format.

Return type: Tuple[Dict[str, Any], Dict[str, Any]]
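
The reset contract can be sketched with a self-contained stand-in. This is not Schola code: ToyMultiAgentEnv and its agent IDs are invented here purely to show the 2-tuple of per-agent dicts that RayEnv.reset returns.

```python
import random
from typing import Any, Dict, Optional, Tuple


class ToyMultiAgentEnv:
    """Hypothetical stand-in mimicking RayEnv's reset signature and return shape."""

    def __init__(self) -> None:
        self.agent_ids = ["agent_0", "agent_1"]  # invented agent IDs
        self.rng = random.Random()

    def reset(
        self, *, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None
    ) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # Seeding makes the toy observations reproducible.
        if seed is not None:
            self.rng.seed(seed)
        observations = {aid: [self.rng.random()] for aid in self.agent_ids}
        infos = {aid: {} for aid in self.agent_ids}
        return observations, infos


env = ToyMultiAgentEnv()
obs, infos = env.reset(seed=42)
```

Both returned dicts are keyed by agent ID, matching the MultiAgentDict format described above.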


step

step(actions)

Step the environment with the given actions.

Parameters

actions (Dict[str, Any]): action dict mapping agent IDs to actions (MultiAgentDict)

Returns

Tuple of (observations, rewards, terminateds, truncateds, infos) in MultiAgentDict format.

Return type: Tuple[Dict[str, Any], Dict[str, float], Dict[str, bool], Dict[str, bool], Dict[str, Any]]
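
The 5-tuple returned by step can likewise be sketched with a stand-in function (toy_step is invented here; only the return shape mirrors RayEnv.step):

```python
from typing import Any, Dict, Tuple


def toy_step(actions: Dict[str, Any]) -> Tuple[
    Dict[str, Any], Dict[str, float], Dict[str, bool], Dict[str, bool], Dict[str, Any]
]:
    """Hypothetical stand-in returning the same 5-tuple shape as RayEnv.step."""
    obs = {aid: [0.0] for aid in actions}
    rewards = {aid: 1.0 for aid in actions}
    # RLlib expects an "__all__" key in the terminateds/truncateds dicts.
    terminateds = {aid: False for aid in actions}
    terminateds["__all__"] = False
    truncateds = {aid: False for aid in actions}
    truncateds["__all__"] = False
    infos: Dict[str, Any] = {aid: {} for aid in actions}
    return obs, rewards, terminateds, truncateds, infos


obs, rewards, terminateds, truncateds, infos = toy_step({"agent_0": 1, "agent_1": 0})
```

A training loop would pass the per-agent action dict in and read rewards and done flags back out per agent ID.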