schola.scripts.sb3.settings.SB3ScriptArgs
- class schola.scripts.sb3.settings.SB3ScriptArgs(enable_checkpoints=False, checkpoint_dir='./ckpt', save_freq=100000, name_prefix_override=None, export_onnx=False, save_final_policy=False, launch_unreal=False, port=None, executable_path=None, headless=False, map=None, fps=None, disable_script=False, timesteps=3000, pbar=False, disable_eval=False, enable_tensorboard=False, log_dir='./logs', log_freq=10, callback_verbosity=0, schola_verbosity=0, sb3_verbosity=1, save_replay_buffer=False, save_vecnormalize=False, resume_from=None, load_vecnormalize=None, load_replay_buffer=None, reset_timestep=False, policy_parameters=None, critic_parameters=None, activation=ActivationFunctionEnum.ReLU, algorithm_settings=<factory>, plugins=<factory>)
Bases: ScriptArgs
Top-level dataclass for configuring the script arguments used in the SB3 launcher. It extends ScriptArgs with settings for the training algorithm, logging, and other configuration, so the training process can be customized through parameters such as timesteps, logging options, network architectures, and algorithm-specific settings.
Methods
__init__([enable_checkpoints, …])
make_unreal_connection()
Create an Unreal Engine connection based on the script arguments.
Attributes
activation
Activation function to use in the policy and critic networks.
callback_verbosity
Verbosity level for callbacks.
checkpoint_dir
Directory in which to save checkpoints.
critic_parameters
A list of layer widths representing the critic (value function) network architecture.
disable_eval
Whether to disable running evaluation after training.
disable_script
Flag indicating whether the autolaunch script setting in the Unreal Engine Schola Plugin should be disabled.
enable_checkpoints
Enable saving checkpoints.
enable_tensorboard
Whether to enable TensorBoard logging.
executable_path
Path to the standalone executable, when launching a standalone Environment.
export_onnx
Whether to export the model to ONNX format instead of just saving a checkpoint.
fps
Fixed FPS to use when running standalone; if None, no fixed timestep is used.
headless
Flag indicating whether the standalone Unreal Engine process should run in headless mode.
launch_unreal
Flag indicating whether the script should launch a standalone Unreal Engine process.
load_replay_buffer
Path to a saved replay buffer to load when resuming training.
load_vecnormalize
Path to a saved vector normalization statistics file to load when resuming training.
log_dir
Directory to save TensorBoard logs.
log_freq
Frequency of logging training metrics to TensorBoard.
map
Map to load when launching a standalone Unreal Engine process.
name_prefix_override
Override the name prefix for the checkpoint files (e.g. SAC, PPO, etc.).
pbar
Whether to display a progress bar during training.
policy_parameters
A list of layer widths representing the policy network architecture.
port
Port to connect to the Unreal Engine process; if None, an open port will be automatically selected when running standalone.
reset_timestep
Whether to reset the internal timestep counter when resuming training from a saved model.
resume_from
Path to a saved model to resume training from.
save_final_policy
Whether to save the final policy after training is complete.
save_freq
Frequency with which to save checkpoints.
save_replay_buffer
Whether to save the replay buffer when saving a checkpoint.
save_vecnormalize
Whether to save the vector normalization statistics when saving a checkpoint.
sb3_verbosity
Verbosity level for Stable Baselines3 logging.
schola_verbosity
Verbosity level for Schola-specific logging.
timesteps
Total number of timesteps to train the agent.
algorithm_settings
The settings for the training algorithm to use.
plugins
A list of Plugins that can be used to extend the behaviour of launch.py.
- Parameters:
  - enable_checkpoints (bool)
  - checkpoint_dir (str)
  - save_freq (int)
  - name_prefix_override (str | None)
  - export_onnx (bool)
  - save_final_policy (bool)
  - launch_unreal (bool)
  - port (int | None)
  - executable_path (str | None)
  - headless (bool)
  - map (str | None)
  - fps (int | None)
  - disable_script (bool)
  - timesteps (int)
  - pbar (bool)
  - disable_eval (bool)
  - enable_tensorboard (bool)
  - log_dir (str)
  - log_freq (int)
  - callback_verbosity (int)
  - schola_verbosity (int)
  - sb3_verbosity (int)
  - save_replay_buffer (bool)
  - save_vecnormalize (bool)
  - resume_from (str | None)
  - load_vecnormalize (str | None)
  - load_replay_buffer (str | None)
  - reset_timestep (bool)
  - policy_parameters (List[int] | None)
  - critic_parameters (List[int] | None)
  - activation (ActivationFunctionEnum)
  - algorithm_settings (PPOSettings | SACSettings)
  - plugins (List[Sb3LauncherExtension])
- __init__(enable_checkpoints=False, checkpoint_dir='./ckpt', save_freq=100000, name_prefix_override=None, export_onnx=False, save_final_policy=False, launch_unreal=False, port=None, executable_path=None, headless=False, map=None, fps=None, disable_script=False, timesteps=3000, pbar=False, disable_eval=False, enable_tensorboard=False, log_dir='./logs', log_freq=10, callback_verbosity=0, schola_verbosity=0, sb3_verbosity=1, save_replay_buffer=False, save_vecnormalize=False, resume_from=None, load_vecnormalize=None, load_replay_buffer=None, reset_timestep=False, policy_parameters=None, critic_parameters=None, activation=ActivationFunctionEnum.ReLU, algorithm_settings=<factory>, plugins=<factory>)
- Parameters:
  - enable_checkpoints (bool)
  - checkpoint_dir (str)
  - save_freq (int)
  - name_prefix_override (str | None)
  - export_onnx (bool)
  - save_final_policy (bool)
  - launch_unreal (bool)
  - port (int | None)
  - executable_path (str | None)
  - headless (bool)
  - map (str | None)
  - fps (int | None)
  - disable_script (bool)
  - timesteps (int)
  - pbar (bool)
  - disable_eval (bool)
  - enable_tensorboard (bool)
  - log_dir (str)
  - log_freq (int)
  - callback_verbosity (int)
  - schola_verbosity (int)
  - sb3_verbosity (int)
  - save_replay_buffer (bool)
  - save_vecnormalize (bool)
  - resume_from (str | None)
  - load_vecnormalize (str | None)
  - load_replay_buffer (str | None)
  - reset_timestep (bool)
  - policy_parameters (List[int] | None)
  - critic_parameters (List[int] | None)
  - activation (ActivationFunctionEnum)
  - algorithm_settings (PPOSettings | SACSettings)
  - plugins (List[Sb3LauncherExtension])
- Return type: None
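As a rough illustration of how these defaults behave, the sketch below uses a small stand-in dataclass rather than the real class, so it runs without Schola installed. `ScriptArgsStandIn` and `PPOSettingsStandIn` are hypothetical names, only a handful of the documented fields are mirrored, and `learning_rate` is an illustrative field that is not taken from this page. The `<factory>` entries in the signature are how dataclasses render fields declared with `default_factory`, which gives every instance its own default object.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PPOSettingsStandIn:
    """Hypothetical placeholder for PPOSettings; its real fields are not documented here."""

    learning_rate: float = 3e-4  # illustrative only, not taken from Schola


@dataclass
class ScriptArgsStandIn:
    """Stand-in mirroring a few documented SB3ScriptArgs fields and their defaults."""

    timesteps: int = 3000
    enable_checkpoints: bool = False
    checkpoint_dir: str = "./ckpt"
    save_freq: int = 100000
    policy_parameters: Optional[List[int]] = None
    # Shown as `<factory>` in the signature: a fresh object is built per instance.
    algorithm_settings: PPOSettingsStandIn = field(default_factory=PPOSettingsStandIn)
    plugins: list = field(default_factory=list)


# All defaults, as in the documented signature.
defaults = ScriptArgsStandIn()

# Override only the fields that differ from the defaults.
custom = ScriptArgsStandIn(
    timesteps=50_000,
    enable_checkpoints=True,
    save_freq=10_000,
    policy_parameters=[64, 64],
)

# Factory defaults are not shared: mutating one instance leaves the other untouched.
defaults.plugins.append("example-plugin")
```

With the real class, the same keyword-argument pattern would presumably apply: pass only the arguments that should differ from the defaults listed in the signature.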
- activation: ActivationFunctionEnum = 'relu'
Activation function to use in the policy and critic networks. This is the non-linear function applied to each layer of the neural networks. Common options include ReLU, Tanh, and Sigmoid. The choice can affect model performance and may depend on the characteristics of the environment. The default is ReLU.
- algorithm_settings: PPOSettings | SACSettings
The settings for the training algorithm to use, either PPOSettings or SACSettings. Switching algorithms (e.g., from PPO to SAC) only requires passing an instance of the other settings class. The default is PPOSettings, which is suitable for most environments.
- callback_verbosity: int = 0
Verbosity level for callbacks. This controls the level of detail in the output from any callbacks used during training.
- critic_parameters: List[int] = None
A list of layer widths representing the critic (value function) network architecture. Each entry is the number of neurons in one hidden layer of the critic network. For example, [64, 64] creates a critic network with two hidden layers of 64 neurons each. This only applies to algorithms that use a critic (e.g., SAC). If set to None, the default architecture defined by the algorithm is used.
- disable_eval: bool = False
Whether to disable running evaluation after training. When set to True, the evaluation that otherwise runs once training completes is skipped.
- enable_tensorboard: bool = False
Whether to enable TensorBoard logging.
- load_replay_buffer: str = None
Path to a saved replay buffer to load when resuming training. Loading a previously saved buffer lets training continue with the same set of experiences. The path should point to a valid replay buffer file created by Stable Baselines3. If set to None, no replay buffer is loaded and a new one is created instead.
- load_vecnormalize: str = None
Path to a saved vector normalization statistics file to load when resuming training. Loading the statistics from a previous training session ensures that observations are normalized consistently after resuming. If set to None, no vector normalization statistics are loaded.
- log_dir: str = './logs'
Directory to save TensorBoard logs.
- log_freq: int = 10
Frequency of logging training metrics to TensorBoard, measured in training steps. A value of 10 means the metrics are recorded every 10 training steps.
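To make the frequency concrete, the sketch below lists the steps at which an every-N-steps event would fire. It assumes steps are counted from 1 and that an event fires whenever the step count is a multiple of the frequency. `due_steps` is a hypothetical helper, not part of Schola or Stable Baselines3, and the exact counting used by the launcher may differ.

```python
from typing import List


def due_steps(total_steps: int, freq: int) -> List[int]:
    """Return the 1-based steps at which an every-`freq`-steps event fires."""
    if freq <= 0:
        raise ValueError("freq must be positive")
    return list(range(freq, total_steps + 1, freq))


# Documented defaults: timesteps=3000, log_freq=10.
log_steps = due_steps(total_steps=3000, freq=10)

# Assumption: save_freq is counted in the same units (this page does not state its unit).
checkpoint_steps = due_steps(total_steps=3000, freq=100000)
```

Under these assumptions the default settings give 300 logging points. If save_freq were counted the same way, its default of 100000 would exceed the default timesteps of 3000, so no periodic checkpoint would fall inside a default-length run.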
- pbar: bool = False
Whether to display a progress bar during training. Requires the tqdm and rich packages to be installed.
- plugins: List[Sb3LauncherExtension]
A list of Plugins that can be used to extend the behaviour of launch.py.
- policy_parameters: List[int] = None
A list of layer widths representing the policy network architecture. Each entry is the number of neurons in one hidden layer of the policy network. For example, [64, 64] creates a policy network with two hidden layers of 64 neurons each. If set to None, the default architecture defined by the algorithm is used.
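The sketch below shows one way to read a layer-width list: as the sizes of the fully connected layers between an input and an output. The observation size of 8 and action size of 2 are made-up values, `layer_shapes` is a hypothetical helper, and the code does not reproduce how Schola or Stable Baselines3 actually build their networks.

```python
from typing import List, Tuple


def layer_shapes(input_dim: int, hidden_widths: List[int], output_dim: int) -> List[Tuple[int, int]]:
    """Return (inputs, outputs) for each dense layer implied by the hidden widths."""
    dims = [input_dim, *hidden_widths, output_dim]
    return list(zip(dims[:-1], dims[1:]))


# Hypothetical sizes: 8 observation values in, 2 action values out.
shapes = layer_shapes(input_dim=8, hidden_widths=[64, 64], output_dim=2)

# Weights plus biases for each dense layer.
param_count = sum(n_in * n_out + n_out for n_in, n_out in shapes)
```

Here `shapes` is `[(8, 64), (64, 64), (64, 2)]`, so widening or adding hidden layers changes the parameter count directly, which is the main trade-off these lists control.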
- reset_timestep: bool = False
Whether to reset the internal timestep counter when resuming training from a saved model. When set to True, the timestep counter is reset to 0.
- resume_from: str = None
Path to a saved model to resume training from, so that training continues from a previously saved checkpoint. The path should point to a valid model file created by Stable Baselines3. If set to None, training starts from scratch.
- save_replay_buffer: bool = False
Whether to save the replay buffer when saving a checkpoint. This allows for resuming training from the same state of the replay buffer.
- save_vecnormalize: bool = False
Whether to save the vector normalization statistics when saving a checkpoint. This is useful for environments where observations need to be normalized, because it allows consistent normalization when resuming training.
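To illustrate why these statistics are worth saving, the sketch below keeps a running mean and variance per observation feature and round-trips them through JSON. `RunningNormalizer` is a simplified, hypothetical stand-in: it is not the Stable Baselines3 implementation, and the file format used by the launcher is not described on this page.

```python
import json
import math
from typing import List


class RunningNormalizer:
    """Simplified per-feature running mean/variance (Welford's algorithm)."""

    def __init__(self, size: int, eps: float = 1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations
        self.eps = eps

    def update(self, obs: List[float]) -> None:
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, obs: List[float]) -> List[float]:
        out = []
        for i, x in enumerate(obs):
            # With no data yet, fall back to unit variance.
            var = self.m2[i] / self.count if self.count else 1.0
            out.append((x - self.mean[i]) / math.sqrt(var + self.eps))
        return out

    def to_json(self) -> str:
        return json.dumps({"count": self.count, "mean": self.mean, "m2": self.m2, "eps": self.eps})

    @classmethod
    def from_json(cls, text: str) -> "RunningNormalizer":
        state = json.loads(text)
        obj = cls(len(state["mean"]), state["eps"])
        obj.count, obj.mean, obj.m2 = state["count"], state["mean"], state["m2"]
        return obj


# First training session collects statistics.
first_run = RunningNormalizer(size=2)
for obs in ([1.0, 10.0], [3.0, 30.0], [5.0, 50.0]):
    first_run.update(obs)

saved = first_run.to_json()                   # analogous to saving the statistics
resumed = RunningNormalizer.from_json(saved)  # analogous to loading them on resume
fresh = RunningNormalizer(size=2)             # what a resume without loading would see
```

The `resumed` normalizer maps an observation to the same values as `first_run`, while `fresh` does not, so a policy trained on normalized observations would see differently scaled inputs if the statistics were not restored.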
- sb3_verbosity: int = 1
Verbosity level for Stable Baselines3 logging. This controls the level of detail in the output from Stable Baselines3 components during training.
- schola_verbosity: int = 0
Verbosity level for Schola-specific logging. This controls the level of detail in the output from Schola-related components during training.
- timesteps: int = 3000
Total number of environment steps used to train the agent. Set this based on the complexity of the environment and the desired training duration. A higher value will typically lead to better performance but will also increase training time.