Setting Up Inference
This guide explains how to run your trained RL agents in inference mode (i.e., without connecting to Python).
It assumes you have already completed a training run with Schola and have either a saved checkpoint or a model exported to ONNX.
Convert a Checkpoint to ONNX
If you did not export to ONNX during training, you will need to convert a checkpoint to ONNX. Use one of the following commands to create an ONNX model from your checkpoint:
schola sb3 export --policy-checkpoint-path <CHECKPOINT_PATH> --output-path <ONNX_PATH> --algorithm <ALGORITHM>
schola rllib export --policy-checkpoint-path <CHECKPOINT_DIR> [--output-path <OUTPUT_DIR>]
For SB3, <ALGORITHM> must match how the checkpoint was trained (for example PPO or SAC); allowed values are the same as the export command’s --algorithm choices in schola sb3 export --help.
For RLlib, <CHECKPOINT_DIR> is the algorithm checkpoint directory produced by Ray. --output-path is optional; if omitted, ONNX output is written alongside the checkpoint (see schola rllib export --help).
These commands produce an ONNX model in Schola’s export layout for use in the next section.
Load an ONNX Model into Unreal Engine
Once you have your ONNX model, import it into Unreal Engine by dragging and dropping the .onnx file into the Content Browser. This creates a new ONNX model data asset in your project.
Setting up Your Unreal Engine Level
Schola’s inference system consists of three main components:
- Agent - Any object implementing the IAgent interface that defines observation and action spaces
- Policy - A UNNEPolicy that loads your trained ONNX model and performs inference
- Stepper - A USimpleStepper (or UPipelinedStepper) that coordinates the observation-inference-action loop
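To make the division of responsibilities concrete, here is a minimal sketch in plain C++ with no Unreal Engine dependencies. Every name in it (ToyAgent, ToyPolicy, ToyStepper, and their methods) is a hypothetical stand-in chosen for illustration, not one of Schola's classes or signatures, and the "policy" is a fixed rule rather than an ONNX model.

```cpp
// Illustrative stand-ins only: these are NOT Schola's classes or signatures.
#include <cassert>
#include <vector>

using Observation = std::vector<float>;
using Action = std::vector<float>;

// Stand-in for the Agent role: exposes observations and consumes actions.
struct ToyAgent {
    float Position = 0.0f;
    float Target = 1.0f;

    // The single observation is the signed distance to the target.
    Observation Observe() const { return {Target - Position}; }

    // The single action is a displacement applied to the position.
    void Act(const Action& InAction) {
        assert(InAction.size() == 1);
        Position += InAction[0];
    }
};

// Stand-in for the Policy role: maps an observation to an action.
// A real policy would run the trained model here; this one uses a fixed rule.
struct ToyPolicy {
    Action Infer(const Observation& Obs) const { return {0.5f * Obs[0]}; }
};

// Stand-in for the Stepper role: runs observe -> infer -> act once per call.
struct ToyStepper {
    std::vector<ToyAgent*> Agents;
    const ToyPolicy* Policy = nullptr;

    void Init(const std::vector<ToyAgent*>& InAgents, const ToyPolicy& InPolicy) {
        Agents = InAgents;
        Policy = &InPolicy;
    }

    void Step() {
        for (ToyAgent* Agent : Agents) {
            const Observation Obs = Agent->Observe();
            const Action Chosen = Policy->Infer(Obs);
            Agent->Act(Chosen);
        }
    }
};
```

In this toy setup, each call to Step() moves the agent halfway toward its target. The point is only that the agent knows nothing about the model, the policy knows nothing about the world, and the stepper is the one piece that connects them.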
Follow these steps to set up inference in your project:
Step 1: Implement the IAgent Interface
Create a class (Actor, Component, or any UObject) that implements the IAgent interface. You must implement these methods:
- Define() - Specify the observation and action spaces for your agent
- Observe() - Collect current observations from the environment
- Act() - Execute actions provided by the policy
- GetStatus() / SetStatus() - Manage agent state
In Blueprint: Create a Blueprint class and add the Agent interface. Implement the Define, Observe, and Act events.
In C++:
UCLASS()
class AMyAgent : public AActor, public IAgent
{
    GENERATED_BODY()

public:
    // Describe the agent's observation and action spaces.
    virtual void Define_Implementation(FInteractionDefinition &OutDefinition) override;
    // Write the current observations into OutObservations.
    virtual void Observe_Implementation(FInstancedStruct &OutObservations) override;
    // Apply the action chosen by the policy.
    virtual void Act_Implementation(const FInstancedStruct &InAction) override;
};
Step 2: Create and Configure the Policy
Create a UNNEPolicy object and configure it with your ONNX model:
- In your Blueprint or C++, create a UNNEPolicy object
- Set the Model Data property to the ONNX model data asset you imported
- Set the Runtime Name to your desired inference runtime (e.g., "NNERuntimeORTCpu" or "NNERuntimeORTDml")
- Call Init() with the agent's interaction definition
In Blueprint:
- Add a UNNEPolicy variable to your Blueprint
- In BeginPlay, call Define on your agent to get the interaction definition
- Call Init on the policy, passing the interaction definition
- Set the Model Data and Runtime Name properties in the details panel
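In C++:
// In practice, hold this pointer in a UPROPERTY() member so the garbage collector keeps the object alive.
UNNEPolicy *Policy = NewObject<UNNEPolicy>(this);
Policy->ModelData = YourOnnxModelDataAsset;
Policy->RuntimeName = TEXT("NNERuntimeORTCpu");

// Query the agent's interaction definition and initialize the policy with it.
FInteractionDefinition Definition;
IAgent::Execute_Define(YourAgent, Definition);
Policy->Init(Definition);
Step 3: Create and Initialize the Stepper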
Create a USimpleStepper to manage the observation-inference-action loop:
- Create a USimpleStepper object
- Call Init() with your agent(s) and policy
- Call Step() each frame (e.g., in Tick()) to run inference
In Blueprint:
- Add a USimpleStepper variable to your Blueprint
- In BeginPlay, call Init with an array of agents and your policy
- In Tick, call Step on the stepper
In C++:
USimpleStepper *Stepper = NewObject<USimpleStepper>(this);

TArray<TScriptInterface<IAgent>> Agents;
Agents.Add(YourAgent);

Stepper->Init(Agents, Policy);

// In your Tick function:
Stepper->Step();
For better performance with slower inference, consider using UPipelinedStepper instead of USimpleStepper. The pipelined stepper overlaps observation collection and action execution with inference.
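The general idea of overlapping inference with the rest of the frame can be sketched in plain C++ using std::async. This is one common way to pipeline such a loop, written under assumptions: it is not Schola's UPipelinedStepper implementation, and ToyAgent, SlowInfer, and ToyPipelinedStepper are hypothetical names. In this sketch, each step applies the action requested on the previous step and then starts the next inference in the background, so actions arrive one step late.

```cpp
// Illustrative sketch only: NOT Schola's UPipelinedStepper implementation.
#include <future>
#include <vector>

using Observation = std::vector<float>;
using Action = std::vector<float>;

// Hypothetical stand-in for a slow model evaluation.
inline Action SlowInfer(Observation Obs) {
    return {0.5f * Obs[0]};
}

// Hypothetical agent that moves along one axis toward a target.
struct ToyAgent {
    float Position = 0.0f;
    float Target = 1.0f;
    Observation Observe() const { return {Target - Position}; }
    void Act(const Action& InAction) { Position += InAction[0]; }
};

class ToyPipelinedStepper {
public:
    explicit ToyPipelinedStepper(ToyAgent& InAgent) : Agent(InAgent) {}

    // Applies the action requested by the previous call (if any), then
    // starts inference on a fresh observation without waiting for it.
    void Step() {
        if (Pending.valid()) {
            Agent.Act(Pending.get()); // Blocks only if inference is still running.
        }
        Pending = std::async(std::launch::async, SlowInfer, Agent.Observe());
    }

    // Waits for and applies the last in-flight action, e.g. at shutdown.
    void Flush() {
        if (Pending.valid()) {
            Agent.Act(Pending.get());
        }
    }

private:
    ToyAgent& Agent;
    std::future<Action> Pending;
};
```

The trade-off in this sketch is latency for throughput: the caller no longer waits for inference inside Step(), but each action is based on an observation that is one step old. Whether that is acceptable depends on how quickly your environment changes between frames.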