schola-rllib Command
This script trains an RLlib model using Schola, allowing customization of training, logging, network architecture, and resource allocation through command-line arguments.
Usage
usage: schola-rllib [-h] [--launch-unreal] [--executable-path EXECUTABLE_PATH] [--headless] [-p PORT] [--map MAP] [--fps FPS] [--disable-script] [-t TIMESTEPS] [--learning-rate LEARNING_RATE] [--minibatch-size MINIBATCH_SIZE] [--train-batch-size-per-learner TRAIN_BATCH_SIZE_PER_LEARNER] [--num-sgd-iter NUM_SGD_ITER] [--gamma GAMMA] [-scholav SCHOLA_VERBOSITY] [-rllibv RLLIB_VERBOSITY] [--enable-checkpoints] [--checkpoint-dir CHECKPOINT_DIR] [--save-freq SAVE_FREQ] [--name-prefix NAME_PREFIX_OVERRIDE] [--export-onnx] [--save-final-policy] [--resume-from RESUME_FROM] [--fcnet-hiddens FCNET_HIDDENS [FCNET_HIDDENS ...]] [--num-workers NUM_WORKERS] [--num-envs-per-worker NUM_ENVS_PER_WORKER] [--num-cpus-per-worker NUM_CPUS_PER_WORKER] [--num-gpus NUM_GPUS] {PPO,APPO,IMPALA} ...
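For example, a minimal run that connects to an already-running Unreal Engine instance and trains with PPO might look like the following (the timestep count is illustrative, not a recommendation):

```shell
# Train a PPO policy for 500k timesteps against an Unreal instance
# already listening on the default port (15151)
schola-rllib -t 500000 PPO
```

Note that the algorithm sub-command (PPO, APPO, or IMPALA) comes last, after all shared options.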
Unreal Process Arguments
--launch-unreal
Launch Unreal Engine automatically
- Default: False
- Required: False
--executable-path
Path to the Unreal Engine executable
- Type: str
- Required: False
--headless
Run Unreal Engine in headless mode
- Default: False
- Required: False
-p, --port
Port for Unreal Engine communication
- Default: 15151
- Type: int
- Required: False
--map
Map to load in Unreal Engine
- Type: str
- Required: False
--fps
Target FPS for Unreal Engine
- Default: 60
- Type: int
- Required: False
--disable-script
Disable script execution in Unreal Engine
- Default: False
- Required: False
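Taken together, these options let the script start and manage the Unreal process itself rather than attaching to one you launched manually. A sketch of such an invocation (the executable path and map name below are placeholders, not real files):

```shell
# Launch a packaged Unreal build headlessly on a custom port and map
schola-rllib --launch-unreal \
  --executable-path /path/to/MyGame.exe \
  --headless -p 15151 --map MyTrainingMap --fps 60 \
  PPO
```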
Training Arguments
-t, --timesteps
Number of timesteps to train
- Default: 1000000
- Type: int
- Required: False
--learning-rate
Learning rate for training
- Default: 0.0003
- Type: float
- Required: False
--minibatch-size
Minibatch size for training
- Default: 32
- Type: int
- Required: False
--train-batch-size-per-learner
Training batch size per learner
- Default: 500
- Type: int
- Required: False
--num-sgd-iter
Number of SGD iterations
- Default: 10
- Type: int
- Required: False
--gamma
Discount factor for future rewards
- Default: 0.99
- Type: float
- Required: False
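These options map onto standard on-policy training hyperparameters (learning rate, batch sizes, SGD epochs per batch, discount factor). For instance, a longer run with larger batches and a shorter effective reward horizon might look like this (all values are illustrative):

```shell
schola-rllib -t 2000000 --learning-rate 1e-4 \
  --minibatch-size 64 --train-batch-size-per-learner 1000 \
  --num-sgd-iter 20 --gamma 0.95 \
  PPO
```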
Logging Arguments
-scholav, --schola-verbosity
Verbosity level for the Schola environment
- Default: 0
- Type: int
- Required: False
-rllibv, --rllib-verbosity
Verbosity level for RLlib
- Default: 1
- Type: int
- Required: False
Checkpoint Arguments
--enable-checkpoints
Enable saving checkpoints
- Default: False
- Required: False
--checkpoint-dir
Directory to save checkpoints
- Default: './ckpt'
- Type: str
- Required: False
--save-freq
Frequency with which to save checkpoints
- Default: 100000
- Type: int
- Required: False
--name-prefix
Override the name prefix for the checkpoint files (e.g. SAC, PPO)
- Type: str
- Required: False
--export-onnx
Export the model to ONNX format instead of just saving a checkpoint
- Default: False
- Required: False
--save-final-policy
Save the final policy after training is complete
- Default: False
- Required: False
--resume-from
Path to a saved model to resume training from
- Type: str
- Required: False
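A typical checkpointing setup saves periodically during training, keeps the final policy, and exports it to ONNX for use outside of Python. A sketch (the directory name is illustrative):

```shell
schola-rllib -t 1000000 \
  --enable-checkpoints --checkpoint-dir ./ckpt --save-freq 100000 \
  --save-final-policy --export-onnx \
  PPO
```

A later run could then pass one of the saved checkpoints to `--resume-from` to continue training.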
Network Architecture Arguments
--fcnet-hiddens
Fully connected network hidden layer sizes
- Type: int (multiple values allowed)
- Required: False
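`--fcnet-hiddens` takes one integer per hidden layer. For example, to train with a two-layer fully connected network of 512 units each (sizes are illustrative):

```shell
schola-rllib --fcnet-hiddens 512 512 PPO
```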
Resource Arguments
--num-workers
Number of worker processes
- Default: 2
- Type: int
- Required: False
--num-envs-per-worker
Number of environments per worker
- Default: 1
- Type: int
- Required: False
--num-cpus-per-worker
Number of CPUs per worker
- Default: 1
- Type: int
- Required: False
--num-gpus
Number of GPUs to use
- Default: 0
- Type: int
- Required: False
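These options scale rollout collection across worker processes and allocate hardware to the learner. One possible configuration for a machine with a single GPU (the counts are illustrative and should be tuned to your hardware):

```shell
# Four rollout workers with two environments each; train on one GPU
schola-rllib --num-workers 4 --num-envs-per-worker 2 \
  --num-cpus-per-worker 1 --num-gpus 1 \
  PPO
```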
Sub-commands
PPO
Proximal Policy Optimization
schola-rllib PPO [-h] [--disable-gae] [--gae-lambda GAE_LAMBDA] [--clip-param CLIP_PARAM]
Optional Arguments
--disable-gae
Disable Generalized Advantage Estimation (GAE) for the PPO algorithm
- Default: True
- Required: False
--gae-lambda
The GAE lambda value for the PPO algorithm
- Default: 0.95
- Type: float
- Required: False
--clip-param
The clip range for the PPO algorithm
- Default: 0.2
- Type: float
- Required: False
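PPO-specific options follow the sub-command rather than preceding it. For example, a run with a tighter clip range and a lower GAE lambda (values are illustrative):

```shell
schola-rllib -t 1000000 PPO --gae-lambda 0.9 --clip-param 0.1
```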
APPO
Asynchronous Proximal Policy Optimization algorithm.
schola-rllib APPO [-h] [--disable-vtrace] [--vtrace-clip-rho-threshold VTRACE_CLIP_RHO_THRESHOLD] [--vtrace-clip-pg-rho-threshold VTRACE_CLIP_PG_RHO_THRESHOLD] [--disable-gae] [--gae-lambda GAE_LAMBDA] [--clip-param CLIP_PARAM]
Optional Arguments
Algorithm-specific arguments for APPO configuration.
--disable-vtrace
Disable the V-trace algorithm
- Default: True
- Required: False
--vtrace-clip-rho-threshold
The clip threshold for V-trace rho values
- Default: 1.0
- Type: float
- Required: False
--vtrace-clip-pg-rho-threshold
The clip threshold for V-trace rho values in the policy gradient
- Default: 1.0
- Type: float
- Required: False
--disable-gae
Disable Generalized Advantage Estimation (GAE) for the PPO algorithm
- Default: True
- Required: False
--gae-lambda
The GAE lambda value for the PPO algorithm
- Default: 0.95
- Type: float
- Required: False
--clip-param
The clip range for the PPO algorithm
- Default: 0.2
- Type: float
- Required: False
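Since APPO collects experience asynchronously, it pairs naturally with multiple rollout workers. A sketch combining shared resource options with APPO-specific V-trace settings (values are illustrative):

```shell
schola-rllib --num-workers 4 APPO \
  --vtrace-clip-rho-threshold 1.0 --clip-param 0.2
```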
IMPALA
Importance Weighted Actor-Learner Architecture algorithm.
schola-rllib IMPALA [-h] [--disable-vtrace] [--vtrace-clip-rho-threshold VTRACE_CLIP_RHO_THRESHOLD] [--vtrace-clip-pg-rho-threshold VTRACE_CLIP_PG_RHO_THRESHOLD]
Optional Arguments
Algorithm-specific arguments for IMPALA configuration.
--disable-vtrace
Disable the V-trace algorithm
- Default: True
- Required: False
--vtrace-clip-rho-threshold
The clip threshold for V-trace rho values
- Default: 1.0
- Type: float
- Required: False
--vtrace-clip-pg-rho-threshold
The clip threshold for V-trace rho values in the policy gradient
- Default: 1.0
- Type: float
- Required: False
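IMPALA is designed for high-throughput distributed collection, so it is usually run with more workers than the default. One possible invocation (the worker and environment counts are illustrative):

```shell
schola-rllib --num-workers 8 --num-envs-per-worker 2 IMPALA \
  --vtrace-clip-rho-threshold 1.0
```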