RayVecEnv
Full path:
schola.rllib.env.RayVecEnv
RayVecEnv(*args, **kwargs)

Bases: BaseRayEnv, VectorMultiAgentEnv
Schola’s vectorized implementation of VectorMultiAgentEnv for Unreal Engine.
This class manages multiple parallel multi-agent environments communicating with Unreal Engine via a protocol/simulator architecture. It follows RLlib’s SyncVectorMultiAgentEnv pattern by maintaining a list of MultiAgentEnv instances in self.envs.
- Inherits from:
  - BaseRayEnv: shared protocol, simulator, and space management
  - VectorMultiAgentEnv: RLlib’s vectorized multi-agent interface
Note: This class does not inherit from MultiAgentEnv; it only wraps MultiAgentEnv instances via _SingleEnvWrapper in the self.envs list.
Use this class when:
- Running with remote runners (num_env_runners >= 1)
- Multiple parallel environments are needed
- Maximum training throughput is desired
Key Features:
- Supports multiple parallel Unreal environments (num_envs >= 1)
- Multi-agent support within each environment
- Automatic episode reset (autoreset_mode="next_step")
- Protocol-based communication with Unreal Engine
- Always returns List[MultiAgentDict] format
- Follows RLlib’s VectorMultiAgentEnv pattern with self.envs list
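Since every return value uses the List[MultiAgentDict] format, it helps to see the shapes concretely. The sketch below uses plain Python dicts with hypothetical agent IDs and observation values; it is not schola code, only an illustration of the data layout:

```python
# Sketch of the List[MultiAgentDict] format RayVecEnv exchanges with RLlib.
# The outer list has one entry per parallel Unreal environment; each entry is
# a dict mapping agent IDs to per-agent values. Agent IDs and observation
# values here are hypothetical.
num_envs = 2

# Observations as returned by reset()/step(): one MultiAgentDict per sub-env.
observations = [
    {"agent_0": [0.1, 0.2], "agent_1": [0.3, 0.4]},  # env 0
    {"agent_0": [0.5, 0.6], "agent_1": [0.7, 0.8]},  # env 1
]

# Actions passed to step() use the same outer structure: one dict per env.
actions = [{agent_id: 0 for agent_id in obs} for obs in observations]

assert len(observations) == num_envs
assert set(actions[0]) == {"agent_0", "agent_1"}
```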
Methods
| Item | Description |
|---|---|
| __init__ | Initialize protocol, simulator, and shared environment infrastructure. |
| close_extras | Close protocol and stop simulator. |
| reset | Reset all sub-environments. |
| step | Step all sub-environments with the given actions. |
Attributes
| Item | Description |
|---|---|
| action_space | Action space (Dict of agent spaces). |
| max_num_agents | Maximum number of agents that can exist. |
| num_agents | Total number of possible agents (ever seen). |
| observation_space | Observation space (Dict of agent spaces). |
| single_action_space | Single-agent action space. |
| single_action_spaces | Dict mapping agent IDs to action spaces. |
| single_observation_space | Single-agent observation space. |
| single_observation_spaces | Dict mapping agent IDs to observation spaces. |
Parameters
protocol (BaseRLProtocol)
simulator (BaseSimulator)
verbosity (int)
__init__
__init__(protocol, simulator, verbosity=0)

Initialize protocol, simulator, and shared environment infrastructure.
Parameters
protocol (BaseRLProtocol)
simulator (BaseSimulator)
verbosity (int)
reset
reset(*, seed=None, options=None)

Reset all sub-environments.
Parameters
seed (int | List[int] | None)
: Random seed (int or list of ints, one per environment)
options (Dict[str, Any] | None)
: Optional reset options
Returns
Tuple of (observations, infos) as List[MultiAgentDict] format.
Return type: Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]
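Because seed may be a single int or a list with one entry per sub-environment, callers often need to expand one base seed into per-environment seeds. The helper below is hypothetical (not part of schola), sketching one common convention of offsetting the base seed per environment:

```python
from typing import List, Optional, Union


def normalize_seeds(seed: Union[int, List[int], None],
                    num_envs: int) -> Optional[List[int]]:
    """Hypothetical helper: expand reset()'s accepted seed forms
    (int, list of ints, or None) into one seed per sub-environment."""
    if seed is None:
        return None
    if isinstance(seed, int):
        # Offset the base seed so each parallel environment is seeded differently.
        return [seed + i for i in range(num_envs)]
    if len(seed) != num_envs:
        raise ValueError(f"expected {num_envs} seeds, got {len(seed)}")
    return list(seed)
```

For example, `normalize_seeds(42, 3)` yields `[42, 43, 44]`, which could then be passed as the per-environment seed list.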
step
step(actions)

Step all sub-environments with the given actions.
Parameters
actions (List[Dict[str, Any]])
: List of action dicts (List[MultiAgentDict])
Returns
Tuple of (observations, rewards, terminateds, truncateds, infos) as List[MultiAgentDict] format.
Return type: Tuple[List[Dict[str, Any]], List[Dict[str, float]], List[Dict[str, bool]], List[Dict[str, bool]], List[Dict[str, Any]]]
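A typical interaction loop over this interface can be sketched with a minimal stand-in class. Everything below is illustrative only: FakeVecEnv mimics RayVecEnv's reset()/step() signatures and List[MultiAgentDict] return format, but it is not the real schola class and involves no Unreal Engine connection:

```python
from typing import Any, Dict, List

MultiAgentDict = Dict[str, Any]


class FakeVecEnv:
    """Stand-in with RayVecEnv-shaped reset()/step() methods,
    returning List[MultiAgentDict] values; not the real schola class."""

    def __init__(self, num_envs: int = 2, agents=("agent_0",)):
        self.num_envs = num_envs
        self.agents = list(agents)
        self._t = 0

    def reset(self, *, seed=None, options=None):
        self._t = 0
        obs = [{a: 0.0 for a in self.agents} for _ in range(self.num_envs)]
        infos = [{} for _ in range(self.num_envs)]
        return obs, infos

    def step(self, actions: List[MultiAgentDict]):
        self._t += 1
        obs = [{a: float(self._t) for a in self.agents} for _ in range(self.num_envs)]
        rewards = [{a: 1.0 for a in self.agents} for _ in range(self.num_envs)]
        terminateds = [{a: self._t >= 3 for a in self.agents} for _ in range(self.num_envs)]
        truncateds = [{a: False for a in self.agents} for _ in range(self.num_envs)]
        infos = [{} for _ in range(self.num_envs)]
        return obs, rewards, terminateds, truncateds, infos


env = FakeVecEnv(num_envs=2)
obs, infos = env.reset(seed=0)
total_reward = 0.0
for _ in range(3):
    # Build one action dict per sub-environment, keyed by agent ID.
    actions = [{agent_id: 0 for agent_id in o} for o in obs]
    obs, rewards, terms, truncs, infos = env.step(actions)
    total_reward += sum(r["agent_0"] for r in rewards)
# 3 steps * 2 envs * reward 1.0 each = 6.0
```

The loop shape (build a per-environment action dict, step, unpack five lists) is the part that carries over to the real class; the reward and termination logic here is invented for the demo.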