RayVecEnv
Full path:
schola.rllib.env.RayVecEnv
RayVecEnv(*args, **kwargs)

Bases: BaseRayEnv, VectorMultiAgentEnv
Schola’s vectorized implementation of VectorMultiAgentEnv for Unreal Engine.
This class manages multiple parallel multi-agent environments communicating with Unreal Engine via a protocol/simulator architecture. It follows RLlib’s SyncVectorMultiAgentEnv pattern by maintaining a list of MultiAgentEnv instances in self.envs.
- Inherits from:
  - BaseRayEnv: shared protocol, simulator, and space management
  - VectorMultiAgentEnv: RLlib’s vectorized multi-agent interface
Note: This class does not inherit from MultiAgentEnv; it only wraps MultiAgentEnv instances via _SingleEnvWrapper in the self.envs list.
Use this class when:
- Running with remote runners (num_env_runners >= 1)
- Multiple parallel environments are needed
- Maximum training throughput is desired
Key Features:
- Supports multiple parallel Unreal environments (num_envs >= 1)
- Multi-agent support within each environment
- Automatic episode reset (autoreset_mode="next_step")
- Protocol-based communication with Unreal Engine
- Always returns List[MultiAgentDict] format
- Follows RLlib’s VectorMultiAgentEnv pattern with self.envs list
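Since every return value uses the List[MultiAgentDict] format, it helps to see the shapes concretely. The sketch below uses plain Python dicts with hypothetical agent IDs and observation values; it is not schola code, only an illustration of the data layout:

```python
# Sketch of the List[MultiAgentDict] format RayVecEnv exchanges with RLlib.
# The outer list has one entry per parallel Unreal environment; each entry is
# a dict mapping agent IDs to per-agent values. Agent IDs and observation
# values here are hypothetical.
num_envs = 2

# Observations as returned by reset()/step(): one MultiAgentDict per sub-env.
observations = [
    {"agent_0": [0.1, 0.2], "agent_1": [0.3, 0.4]},  # env 0
    {"agent_0": [0.5, 0.6], "agent_1": [0.7, 0.8]},  # env 1
]

# Actions passed to step() use the same outer structure: one dict per env.
actions = [{agent_id: 0 for agent_id in obs} for obs in observations]

assert len(observations) == num_envs
assert set(actions[0]) == {"agent_0", "agent_1"}
```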
Methods
| Item | Description |
|---|---|
| __init__ | Initialize protocol, simulator, and shared environment infrastructure. |
| close_extras | Close protocol and stop simulator. |
| reset | Reset all sub-environments. |
| step | Step all sub-environments with the given actions. |
Attributes
| Item | Description |
|---|---|
| action_space | Action space (Dict of agent spaces). |
| max_num_agents | Maximum number of agents that can exist. |
| num_agents | Total number of possible agents (ever seen). |
| observation_space | Observation space (Dict of agent spaces). |
| single_action_space | Single-agent action space. |
| single_action_spaces | Dict mapping agent IDs to action spaces. |
| single_observation_space | Single-agent observation space. |
| single_observation_spaces | Dict mapping agent IDs to observation spaces. |
Parameters
protocol (BaseRLProtocol)
simulator (BaseSimulator)
verbosity (int)
__init__
__init__(protocol, simulator, verbosity=0)

Initialize protocol, simulator, and shared environment infrastructure.
Parameters
protocol (BaseRLProtocol)
simulator (BaseSimulator)
verbosity (int)
reset
reset(*, seed=None, options=None)

Reset all sub-environments.
Parameters
seed (int | List[int] | None)
: Random seed (int or list of ints, one per environment)
options (Dict[str, Any] | None)
: Optional reset options
Returns
Tuple of (observations, infos) as List[MultiAgentDict] format.
Return type: Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]
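Because seed may be a single int or a list with one entry per sub-environment, callers often need to expand one base seed into per-environment seeds. The helper below is hypothetical (not part of schola), sketching one common convention of offsetting the base seed per environment:

```python
from typing import List, Optional, Union


def normalize_seeds(seed: Union[int, List[int], None],
                    num_envs: int) -> Optional[List[int]]:
    """Hypothetical helper: expand reset()'s accepted seed forms
    (int, list of ints, or None) into one seed per sub-environment."""
    if seed is None:
        return None
    if isinstance(seed, int):
        # Offset the base seed so each parallel environment is seeded differently.
        return [seed + i for i in range(num_envs)]
    if len(seed) != num_envs:
        raise ValueError(f"expected {num_envs} seeds, got {len(seed)}")
    return list(seed)
```

For example, `normalize_seeds(42, 3)` yields `[42, 43, 44]`, which could then be passed as the per-environment seed list.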
step
step(actions)

Step all sub-environments with the given actions.
Parameters
actions (List[Dict[str, Any]])
: List of action dicts (List[MultiAgentDict])
Returns
Tuple of (observations, rewards, terminateds, truncateds, infos) as List[MultiAgentDict] format.
Return type: Tuple[List[Dict[str, Any]], List[Dict[str, float]], List[Dict[str, bool]], List[Dict[str, bool]], List[Dict[str, Any]]]
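A typical interaction loop over this interface can be sketched with a minimal stand-in class. Everything below is illustrative only: FakeVecEnv mimics RayVecEnv's reset()/step() signatures and List[MultiAgentDict] return format, but it is not the real schola class and involves no Unreal Engine connection:

```python
from typing import Any, Dict, List

MultiAgentDict = Dict[str, Any]


class FakeVecEnv:
    """Stand-in with RayVecEnv-shaped reset()/step() methods,
    returning List[MultiAgentDict] values; not the real schola class."""

    def __init__(self, num_envs: int = 2, agents=("agent_0",)):
        self.num_envs = num_envs
        self.agents = list(agents)
        self._t = 0

    def reset(self, *, seed=None, options=None):
        self._t = 0
        obs = [{a: 0.0 for a in self.agents} for _ in range(self.num_envs)]
        infos = [{} for _ in range(self.num_envs)]
        return obs, infos

    def step(self, actions: List[MultiAgentDict]):
        self._t += 1
        obs = [{a: float(self._t) for a in self.agents} for _ in range(self.num_envs)]
        rewards = [{a: 1.0 for a in self.agents} for _ in range(self.num_envs)]
        terminateds = [{a: self._t >= 3 for a in self.agents} for _ in range(self.num_envs)]
        truncateds = [{a: False for a in self.agents} for _ in range(self.num_envs)]
        infos = [{} for _ in range(self.num_envs)]
        return obs, rewards, terminateds, truncateds, infos


env = FakeVecEnv(num_envs=2)
obs, infos = env.reset(seed=0)
total_reward = 0.0
for _ in range(3):
    # Build one action dict per sub-environment, keyed by agent ID.
    actions = [{agent_id: 0 for agent_id in o} for o in obs]
    obs, rewards, terms, truncs, infos = env.step(actions)
    total_reward += sum(r["agent_0"] for r in rewards)
# 3 steps * 2 envs * reward 1.0 each = 6.0
```

The loop shape (build a per-environment action dict, step, unpack five lists) is the part that carries over to the real class; the reward and termination logic here is invented for the demo.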