Migrating from Schola V1.3 to V2
Schola V2 introduces significant improvements to the API, including better separation of concerns, more flexible environment interfaces, and improved Python integration. This guide will help you migrate your existing Schola V1.3 projects to V2.
Overview of Changes
The major changes in Schola V2 include:
- Unified Environment Interfaces: Single and multi-agent environments now use a consistent interface with better type safety
- Protocol-Simulator Architecture: Python API now separates communication protocols from simulation management
- New Actuators & Sensors: Refactored component-based actuators and sensors with cleaner interfaces
- Improved Training Settings: More structured and flexible training configuration
- Better Imitation Learning Support: Dedicated interfaces for imitation learning workflows
Python API Changes
Environment Connection
V1.3 Approach:

```python
from schola import UnrealConnection

connection = UnrealConnection(port=50051)
env = GymEnv(connection)
```

V2 Approach:

```python
from schola.core.simulators.unreal import UnrealEditor
from schola.core.protocols.protobuf.gRPC import gRPCProtocol
from schola.gym import GymEnv

# Define protocol (communication layer)
protocol = gRPCProtocol(url="localhost", port=50051)

# Define simulator (Unreal Engine management)
simulator = UnrealEditor()

# Create environment
env = GymEnv(simulator=simulator, protocol=protocol)
```

The new architecture separates:
- Protocol: How to communicate with Unreal (gRPC, shared memory, etc.)
- Simulator: How to manage the Unreal Engine process (editor, executable, etc.)
This allows mixing and matching different protocols and simulators as needed.
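The mix-and-match idea can be sketched in plain Python (no Schola imports; every class name below is an illustrative stand-in, not Schola's actual API):

```python
from abc import ABC, abstractmethod

# Illustrative stand-ins for the protocol/simulator split.
# These names are hypothetical, not the real Schola classes.

class Protocol(ABC):
    """How to talk to the running simulation."""
    @abstractmethod
    def connect(self) -> str: ...

class Simulator(ABC):
    """How to start and manage the Unreal process."""
    @abstractmethod
    def launch(self) -> str: ...

class GrpcProtocol(Protocol):
    def __init__(self, url: str, port: int):
        self.url, self.port = url, port
    def connect(self) -> str:
        return f"gRPC -> {self.url}:{self.port}"

class EditorSimulator(Simulator):
    def launch(self) -> str:
        return "attached to running editor"

class ExecutableSimulator(Simulator):
    def __init__(self, path: str):
        self.path = path
    def launch(self) -> str:
        return f"launched {self.path}"

class Env:
    # The environment depends only on the two abstract interfaces,
    # so any protocol works with any simulator.
    def __init__(self, simulator: Simulator, protocol: Protocol):
        self.status = f"{simulator.launch()}; {protocol.connect()}"

# Mix and match freely:
a = Env(EditorSimulator(), GrpcProtocol("localhost", 50051))
b = Env(ExecutableSimulator("Game.exe"), GrpcProtocol("localhost", 50052))
```

Because the environment depends only on the two abstractions, adding a new protocol (say, shared memory) or a new simulator never requires touching the environment class.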
CLI Changes
V1.3 Commands:
# Stable Baselines 3schola-sb3 PPO --learning-rate 0.0003 --n-steps 2048schola-sb3 SAC --buffer-size 1000000
# RLlibschola-rllib PPO --learning-rate 0.0003V2 Commands:
# Stable Baselines 3schola sb3 train ppo --learning-rate 0.0003 --n-steps 2048schola sb3 train sac --buffer-size 1000000
# RLlibschola rllib train ppo --learning-rate 0.0003
# Or using module invocationpython -m schola.scripts.sb3.train ppopython -m schola.scripts.rllib.train ppoKey changes:
-
V1.3 used separate entry points (
schola-sb3,schola-rllib) -
V2 uses a unified
scholacommand with subcommands -
V2 adds explicit
trainsubcommand for better organization -
Algorithm is now a subcommand for sb3 (
ppo,sac) and RLlib (ppo,sac,impala)
Vectorized Environments
V1.3:

```python
from schola import GymVectorEnv

connection = ...
env = GymVectorEnv(connection, num_envs=4)
```

V2:

```python
from schola.gym import GymVectorEnv

simulator = ...
protocol = ...

# Environment vectorization is now handled by Unreal internally
# Just connect to an environment with multiple instances
env = GymVectorEnv(simulator=simulator, protocol=protocol)
```

Auto-Reset Configuration
V1.3:
In V1.3, auto-reset behavior was implicit and always enabled for vectorized environments. There was no explicit configuration needed:
```python
from schola.gym import VecEnv
from schola.core.unreal_connections import UnrealEditorConnection

# Auto-reset was automatically enabled for VecEnv
connection = UnrealEditorConnection("localhost", port=8002)
env = VecEnv(connection)  # Auto-reset implicitly enabled
```

V2:
In V2, auto-reset behavior is explicitly configured when creating the environment:
```python
from schola.gym import GymEnv, GymVectorEnv
from schola.core.simulators.unreal import UnrealEditor
from schola.core.protocols.protobuf.gRPC import gRPCProtocol
from gymnasium.vector.vector_env import AutoresetMode

protocol = gRPCProtocol(url="localhost", port=50051)
simulator = UnrealEditor()

# For vectorized environments - explicitly specify autoreset_mode
env = GymVectorEnv(
    simulator=simulator,
    protocol=protocol,
    autoreset_mode=AutoresetMode.SAME_STEP,  # Default, but now explicit
)

# For single-agent GymEnv, autoreset is always DISABLED
env_single = GymEnv(simulator=simulator, protocol=protocol)
```

Key Differences:
- V1.3: Auto-reset was implicit and always on for vectorized environments
- V2: Auto-reset is an explicit parameter with three modes:
  - DISABLED: No automatic reset; you must call env.reset() manually when episodes end
  - SAME_STEP: Automatically resets and returns the first observation of the new episode in the same step (default)
  - NEXT_STEP: Resets on the next step (not commonly used)

Why This Matters:
V2’s explicit configuration gives you more control over environment behavior, especially when debugging or wanting manual control over episode boundaries.
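The behavioral difference between the modes can be seen in a toy loop (pure Python, no Schola; the stand-in environment below terminates every episode after 3 steps):

```python
class ToyEnv:
    """Stand-in env whose episodes terminate after 3 steps."""
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # observation = step counter

    def step(self):
        self.t += 1
        terminated = self.t >= 3
        return self.t, terminated

def run(env, n_steps, autoreset):
    """Collect observations with/without SAME_STEP-style autoreset."""
    obs_log = [env.reset()]
    for _ in range(n_steps):
        obs, terminated = env.step()
        if terminated and autoreset:
            # SAME_STEP: the same step already returns the first
            # observation of the new episode
            obs = env.reset()
        obs_log.append(obs)
        if terminated and not autoreset:
            # DISABLED: the caller must reset manually; stop here
            break
    return obs_log

print(run(ToyEnv(), 5, autoreset=True))   # episodes roll over automatically
print(run(ToyEnv(), 5, autoreset=False))  # stops at the first termination
```

With autoreset on, the log rolls straight into the next episode's observations; with it off, collection halts at the terminal step until you reset manually.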
Unreal Engine API Changes
Environment Interface
The environment interface has been unified and improved for both single and multi-agent scenarios.
V1.3 Multi-Agent Interface:
```cpp
UFUNCTION(BlueprintNativeEvent)
void RegisterAgents(TMap<FString, FAgentDefinition> &OutDefinitions);

UFUNCTION(BlueprintNativeEvent)
void Reset(TMap<FString, FObservation> &OutObservations);

UFUNCTION(BlueprintNativeEvent)
void Step(const TMap<FString, FAction> &Actions,
          TMap<FString, FObservation> &OutObservations,
          TMap<FString, float> &OutRewards,
          TMap<FString, bool> &OutDones);
```

V2 Multi-Agent Interface:
```cpp
UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Schola|Environment")
void InitializeEnvironment(TMap<FString, FInteractionDefinition> &OutAgentDefinitions);

UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Schola|Environment")
void Reset(TMap<FString, FInitialAgentState> &OutAgentState);

UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Schola|Environment")
void Step(const TMap<FString, FInstancedStruct> &InActions,
          TMap<FString, FAgentState> &OutAgentStates);
```

Key changes:
- InitializeEnvironment: Now separate from the first reset; includes both observation and action space definitions via FInteractionDefinition
- FAgentState: Consolidated structure containing observations, rewards, terminated, truncated, and info
- FInitialAgentState: Separate structure for reset that includes observations and info only
- Terminated vs Truncated: Now properly separated, following Gymnasium conventions
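The terminated/truncated split matters downstream of the environment as well: a value-based learner should bootstrap through a truncation (the episode was cut short, not finished) but not through a termination. A minimal, library-free sketch of that convention (illustrative only, not Schola code):

```python
def td_target(reward, next_value, terminated, truncated, gamma=0.99):
    """One-step TD target under the Gymnasium terminated/truncated split.

    terminated: the MDP reached a true terminal state -> future value is 0
    truncated:  the episode was cut off (e.g. time limit) -> the state is
                not terminal, so we still bootstrap from next_value
    Returns (target, done), where done tells the caller to reset the env.
    """
    done = terminated or truncated                  # either way, reset next
    bootstrap = 0.0 if terminated else next_value   # truncation still bootstraps
    return reward + gamma * bootstrap, done
```

Collapsing both flags into a single done, as V1.3 did, silently treats every time limit as a terminal state and biases value estimates.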
Base Interface - Inheritance vs Interface
V1.3 used an inheritance-based approach with AAbstractScholaEnvironment (an Actor), while V2 uses an interface-based approach with IBaseScholaEnvironment. This is a fundamental architectural change that also eliminates the separate Trainer system.
V1.3 Approach (Inheritance from Actor with Trainers):
In V1.3, environments managed agents through an AAbstractTrainer system:
```cpp
// V1.3 - Inherit from AAbstractScholaEnvironment Actor
UCLASS()
class MYGAME_API AMyEnvironment : public AAbstractScholaEnvironment
{
    GENERATED_BODY()

protected:
    // V1.3 stored trainers internally
    TMap<int, AAbstractTrainer *> Trainers;

public:
    // Override virtual functions from the base class
    virtual void RegisterAgents() override; // Registered trainers with environment
    virtual void ResetEnvironment() override;
    virtual void InitializeEnvironment() override;
};

// Separate AAbstractTrainer classes handled agent logic
UCLASS()
class MYGAME_API AMyTrainer : public AAbstractTrainer
{
    GENERATED_BODY()

public:
    virtual void ComputeReward() override;
    virtual void ComputeStatus() override;
};
```

The V1.3 architecture separated concerns:
- Environment (AAbstractScholaEnvironment): Managed the world state
- Trainer (AAbstractTrainer): Handled per-agent reward/status logic
- Static vs Dynamic: Different environment types for fixed vs. runtime agent spawning
V2 Approach (Unified Interface - No Trainers):
In V2, the Trainer system is removed. The environment directly handles all logic including rewards and episode status:
```cpp
// V2 - Implement IMultiAgentScholaEnvironment interface on ANY Actor or Object
UCLASS()
class MYGAME_API AMyEnvironment : public AActor, public IMultiAgentScholaEnvironment
{
    GENERATED_BODY()

public:
    // Implement BlueprintNativeEvent functions from the interface
    virtual void InitializeEnvironment_Implementation(
        TMap<FString, FInteractionDefinition> &OutAgentDefinitions) override;

    virtual void Reset_Implementation(
        TMap<FString, FInitialAgentState> &OutAgentState) override;

    // Step now handles EVERYTHING: observations, rewards, terminated, truncated
    virtual void Step_Implementation(const TMap<FString, FInstancedStruct> &InActions,
                                     TMap<FString, FAgentState> &OutAgentStates) override
    {
        // 1. Apply actions to agents
        // 2. Update world state
        // 3. Collect observations
        // 4. Compute rewards
        // 5. Check termination/truncation
        // 6. Populate OutAgentStates with all of the above
    }

    virtual void SeedEnvironment_Implementation(int Seed) override;

    virtual void SetEnvironmentOptions_Implementation(
        const TMap<FString, FString> &Options) override;
};
```

The V2 architecture simplifies to:
- Environment Interface Only: All logic (observation, action, reward, termination) in one place
- No Trainer Layer: Reward and status computation happens directly in Step()
- Unified Agent Handling: No distinction between static and dynamic environments; all use the same interface
Blueprint Implementation:
V1.3 Blueprint:
- Inherited from the AbstractScholaEnvironment Blueprint class
- Created separate AbstractTrainer Blueprint classes for each agent type
- Environment: Overrode the Register Agents event to register trainers
- Trainer: Implemented the Compute Reward and Compute Status events
V2 Blueprint:
- Can use any Actor Blueprint as the base (no fixed inheritance)
- Add the MultiAgentScholaEnvironment interface in Class Settings
- No separate Trainer classes needed
- Implement all logic in the environment interface events:
  - Initialize Environment: Define observation and action spaces
  - Reset: Reset the environment and return initial observations
  - Step: Apply actions, then compute observations, rewards, and termination all in one place
  - Seed Environment: Handle seeding
  - Set Environment Options: Handle configuration options

Key Advantages of V2’s Interface Approach:
- Flexibility: Your environment can inherit from any Actor class (APawn, ACharacter, custom base class)
- Multiple Interfaces: Can implement both training and imitation interfaces
- Simpler Architecture: No separate Trainer layer - all logic in one place
- Cleaner Separation: Environment logic is not tied to a specific base class
- Better for Blueprints: Easier to add Schola to existing Blueprint hierarchies
- Unified Agent Handling: No more Static vs Dynamic environment distinction
Interface Discovery:
```cpp
// V2 - The GymConnector finds environments by the base interface
UINTERFACE(BlueprintType, Blueprintable)
class UBaseScholaEnvironment : public UInterface
{
    GENERATED_BODY()
};

// Implement the specific interface for your use case:

// Single-agent environments
UINTERFACE(BlueprintType, Blueprintable)
class USingleAgentScholaEnvironment : public UBaseScholaEnvironment { /*...*/ };

// Multi-agent environments
UINTERFACE(BlueprintType, Blueprintable)
class UMultiAgentScholaEnvironment : public UBaseScholaEnvironment { /*...*/ };
```

New Functions
V2 adds several new environment functions:
```cpp
// Seed the environment for reproducibility
UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Schola|Environment")
void SeedEnvironment(int Seed);

// Set environment options from Python
UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Schola|Environment")
void SetEnvironmentOptions(const TMap<FString, FString> &Options);
```

Actuators and Sensors
Actuators and sensors are now interfaces that can be implemented by any class, not just as actor components.
V1.3 Actuator Pattern:
```cpp
UCLASS()
class UMyActuator : public UActorComponent
{
    // Custom implementation
};
```

V2 Actuator Pattern:

```cpp
UCLASS(BlueprintType, Blueprintable, meta = (BlueprintSpawnableComponent))
class UMyActuator : public UActorComponent, public IScholaActuator
{
    GENERATED_BODY()

public:
    // Interface implementation
    UFUNCTION(BlueprintNativeEvent, Category = "Schola|Actuator")
    void GetActionSpace(FInstancedStruct &OutActionSpace) const;

    UFUNCTION(BlueprintNativeEvent, Category = "Schola|Actuator")
    void TakeAction(const FInstancedStruct &Action);

    UFUNCTION(BlueprintNativeEvent, Category = "Schola|Actuator")
    void InitActuator();
};
```

V2 Sensor Pattern:

```cpp
UCLASS(BlueprintType, Blueprintable, meta = (BlueprintSpawnableComponent))
class UMySensor : public USceneComponent, public IScholaSensor
{
    GENERATED_BODY()

public:
    UFUNCTION(BlueprintNativeEvent, Category = "Schola|Sensor")
    void CollectObservations(FInstancedStruct &OutObservations);

    UFUNCTION(BlueprintNativeEvent, Category = "Schola|Sensor")
    void GetObservationSpace(FInstancedStruct &OutObservationSpace) const;

    UFUNCTION(BlueprintNativeEvent, Category = "Schola|Sensor")
    void InitSensor();
};
```

Gym Connector Changes
V1.3 Setup:

In V1.3 and earlier, the Gym Connector was managed by a hidden subsystem. In V2, it is an object you place and interact with directly, giving you control over the connector's lifecycle and timing.
V2 Setup:
```cpp
// More specific connector type
UPROPERTY(EditAnywhere)
URPCGymConnector *GymConnector; // Attach to some object in the scene
```

V2 Connector Settings: Previously, the connector settings were available through the plugin settings menu; now they are available on the GymConnector itself.

```cpp
UPROPERTY(EditAnywhere, Category = "Schola|gRPC")
FRPCServerSettings ServerSettings; // Port, address configuration

UPROPERTY(EditAnywhere, Category = "Script Settings")
FScriptSettings ScriptSettings; // Python script launch configuration

UPROPERTY(EditAnywhere, Category = "External Gym Connector Settings")
FExternalGymConnectorSettings ExternalSettings; // Timeout settings
```

Initialization
V1.3:
```cpp
GymConnector->Initialize(Environments);
```

V2:

```cpp
// Initialize with an array of environment interfaces
TArray<TScriptInterface<IBaseScholaEnvironment>> Environments;
GymConnector->Init(Environments);
```

Training Settings
Training settings are now more structured with separate configuration objects.
V2 RLlib Settings Structure:
```cpp
UPROPERTY(EditAnywhere)
FRLlibTrainingSettings TrainingSettings;
```

FRLlibTrainingSettings groups the following settings objects:

```cpp
FRLlibPPOSettings AlgorithmSettings;
FRLlibNetworkArchitectureSettings NetworkArchitectureSettings;
FRLlibCheckpointSettings CheckpointSettings;
FRLlibLoggingSettings LoggingSettings;
FRLlibResourceSettings ResourceSettings;
FRLlibResumeSettings ResumeSettings;
```

V2 Stable Baselines 3 Settings Structure:

```cpp
UPROPERTY(EditAnywhere)
FSB3TrainingSettings TrainingSettings;
```

FSB3TrainingSettings groups the following settings objects:

```cpp
FSB3PPOSettings AlgorithmSettings;
FSB3NetworkArchitectureSettings NetworkArchitectureSettings;
FSB3CheckpointSettings CheckpointSettings;
FSB3LoggingSettings LoggingSettings;
FSB3ResumeSettings ResumeSettings;
```

Points and Spaces
The Points and Spaces API remains largely compatible, but now uses FInstancedStruct and TInstancedStruct in place of TVariant.
V2:
```cpp
// Using instanced structs for type erasure
FInstancedStruct ActionSpace; // Contains FBoxSpace
FInstancedStruct Action;      // Contains FBoxPoint

// Access typed values
const FBoxSpace &BoxSpace = ActionSpace.Get<FBoxSpace>();
const FBoxPoint &BoxPoint = Action.Get<FBoxPoint>();
```

Imitation Learning
V2 introduces dedicated interfaces for imitation learning (behavior cloning).
Imitation Environment Interface:
Python Imitation API:
```python
from schola.minari.datacollector import ScholaDataCollector
from schola.core.protocols.protobuf.offlinegRPC import gRPCImitationProtocol
from schola.core.simulators.unreal import UnrealEditor

protocol = gRPCImitationProtocol(url="localhost", port=50051)
simulator = UnrealEditor()

collector = ScholaDataCollector(protocol, simulator, seed=123)

for i in range(10):
    collector.step()

schola_dataset = collector.create_dataset("dataset_name")
```

Migration Checklist
Use this checklist to ensure you’ve covered all necessary changes:
Python Code
☐ Replace UnrealConnection with protocol + simulator pattern
☐ Update import statements to use new module structure
☐ Update CLI commands to use schola entry point
☐ Replace auto_reset settings with AutoresetMode enum
☐ Update vectorized environment usage
☐ Update any custom protocol/connection code
Unreal Engine Code (C++)
☐ Update environment base class from AAbstractScholaEnvironment to IMultiAgentScholaEnvironment or ISingleAgentScholaEnvironment interface
☐ Remove all Trainer classes (AAbstractTrainer and subclasses are no longer used)
☐ Migrate Trainer logic into the environment’s Step() function:
- Move ComputeReward() logic into Step()
- Move ComputeStatus() logic into Step()
- Consolidate all per-agent logic in one place
☐ Replace RegisterAgents with InitializeEnvironment
☐ Update Reset signature to return FInitialAgentState
☐ Update Step signature to use FAgentState (includes observations, rewards, terminated, truncated)
☐ Add SeedEnvironment implementation
☐ Add SetEnvironmentOptions implementation
☐ Split Done flag into Terminated and Truncated
☐ Update connector type from UGymConnector to specific type (e.g., URPCGymConnector)
☐ Update connector initialization to use Init()
☐ Update training settings to use structured settings objects
Unreal Engine Blueprints
☐ Update environment Blueprint to implement MultiAgentScholaEnvironment or SingleAgentScholaEnvironment interface
☐ Delete all Trainer Blueprints (Trainer classes are no longer needed)
☐ Migrate Trainer logic into the environment’s Step event:
- Move reward calculation from the Trainer into the environment’s Step
- Move episode status logic from the Trainer into the environment’s Step
- Remove the Register Agents event that registered trainers
☐ Update actuator components to implement IScholaActuator
☐ Update sensor components to implement IScholaSensor
☐ Replace custom actuators with built-in ones where possible
☐ Update agent state structures in Blueprint nodes
☐ Update Done logic to separate Terminated and Truncated
Common Migration Issues
Issue: “Interface not found”
Problem: Environment doesn’t appear in Schola’s environment list
Solution: Ensure your environment implements IBaseScholaEnvironment (either through ISingleAgentScholaEnvironment or IMultiAgentScholaEnvironment)
Issue: “Type mismatch with FInstancedStruct”
Problem: Compiler errors when working with Points and Spaces
Solution: Use FInstancedStruct wrapper and access typed values with Get<T>()
```cpp
// Correct V2 approach
FInstancedStruct ActionStruct;
ActionStruct.InitializeAs<FBoxPoint>();
FBoxPoint &Action = ActionStruct.GetMutable<FBoxPoint>();
```

Issue: “Environment step not called”
Problem: Step function isn’t being invoked
Solution:
- Verify the connector is properly initialized with Init()
- Check that the Python script sent the startup message
- Ensure the environment is in the connector’s environment list
Issue: “Python connection timeout”
Problem: Python script times out connecting to Unreal
Solution:
- Verify port numbers match between Unreal and Python
- Increase environment_start_timeout in gRPCProtocolArgs
- Check firewall settings
Additional Resources
- Getting Started with Schola - Setting up Schola V2 from scratch
- Running Schola - Running training with V2
For more detailed API documentation, see: