FSB3SACSettings

struct FSB3SACSettings : public FTrainingSettings

A struct to hold SAC settings for an SB3 training script.

Note: This is a partial, non-exhaustive implementation of the SAC settings.

Dependencies: FScriptArgBuilder, FTrainingSettings

Inherits from: public FTrainingSettings

Public Interface

Destructor:

~FSB3SACSettings

virtual ~FSB3SACSettings()

Attributes: virtual

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 82, column 9)

Implementation: Schola/Source/Schola/Private/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.cpp (line 31)

Public Functions:

GenerateTrainingArgs

virtual void GenerateTrainingArgs(int Port, FScriptArgBuilder &ArgBuilder) const

Generate the training arguments for the script using the ArgBuilder.

Note: Port is supplied because it is a common argument to pass to scripts. It is set at a high level but might be needed by specific subsettings.

Parameters:

Port (int) – [in] The port to use for the script
ArgBuilder (FScriptArgBuilder &) – [in] The builder to use to generate the arguments

Attributes: const, virtual

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 80, column 6)

Implementation: Schola/Source/Schola/Private/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.cpp (lines 6-30)
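The interface of FScriptArgBuilder is not shown on this page, so the following is only a minimal sketch of the pattern in standard C++. `FakeArgBuilder`, `SacSettingsSketch`, the `AddArg`/`AddFlag` methods, and all flag names are hypothetical stand-ins for illustration, not Schola's actual API. It shows a settings struct writing its fields into a builder that the caller passes by reference.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for FScriptArgBuilder; the real class may differ.
struct FakeArgBuilder
{
    std::vector<std::string> Args;

    // Append "--Name Value" as two separate tokens.
    void AddArg(const std::string& Name, const std::string& Value)
    {
        Args.push_back("--" + Name);
        Args.push_back(Value);
    }

    // Append "--Name" only when the flag is enabled.
    void AddFlag(const std::string& Name, bool bEnabled)
    {
        if (bEnabled)
        {
            Args.push_back("--" + Name);
        }
    }
};

// Plain C++ mirror of a few documented members and their defaults.
struct SacSettingsSketch
{
    int BufferSize = 1000000;
    int BatchSize = 256;
    bool UseSDE = false;
    std::string TargetEntropy = "auto";

    // Flag names below are illustrative assumptions.
    void GenerateTrainingArgs(int Port, FakeArgBuilder& ArgBuilder) const
    {
        (void)Port; // Not needed by this sketch; available to subsettings.
        ArgBuilder.AddArg("buffer-size", std::to_string(BufferSize));
        ArgBuilder.AddArg("batch-size", std::to_string(BatchSize));
        ArgBuilder.AddArg("target-entropy", TargetEntropy);
        ArgBuilder.AddFlag("use-sde", UseSDE);
    }
};
```

With the defaults above, the sketch emits three name/value pairs and omits the boolean flag because UseSDE is false.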

Public Members:

float LearningRate

float LearningRate = 0.0003

The learning rate for the SAC algorithm.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 22, column 7)

int BufferSize

int BufferSize = 1000000

The replay buffer size for the SAC algorithm.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 26, column 5)

int LearningStarts

int LearningStarts = 100

The number of steps to take before learning starts.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 30, column 5)

int BatchSize

int BatchSize = 256

The batch size to use during gradient descent.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 34, column 5)

float Tau

float Tau = 0.005

The Tau value for the SAC algorithm (the soft update coefficient for the target network).

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 38, column 7)

float Gamma

float Gamma = 0.99

The gamma value (discount factor) for the SAC algorithm.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 42, column 7)

int TrainFreq

int TrainFreq = 1

The frequency, in steps, at which the model is updated. The target network is updated on its own schedule; see TargetUpdateInterval.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 46, column 5)

int GradientSteps

int GradientSteps = 1

The number of gradient steps to take during training.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 50, column 5)
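To show how LearningStarts, TrainFreq, and GradientSteps interact, here is a simplified model of the update schedule. It assumes these members map onto SB3's `learning_starts`, `train_freq`, and `gradient_steps`. It is not SB3's actual training loop, and `CountGradientSteps` is a hypothetical helper.

```cpp
// Simplified model: after the warm-up period, every TrainFreq-th
// environment step triggers GradientSteps gradient updates.
long CountGradientSteps(int TotalSteps, int LearningStarts, int TrainFreq, int GradientSteps)
{
    long Total = 0;
    for (int Step = 1; Step <= TotalSteps; ++Step)
    {
        const bool bWarmupDone = Step > LearningStarts;
        const bool bUpdateDue = (Step % TrainFreq) == 0;
        if (bWarmupDone && bUpdateDue)
        {
            Total += GradientSteps;
        }
    }
    return Total;
}
```

Under this model, the defaults (LearningStarts = 100, TrainFreq = 1, GradientSteps = 1) give 900 gradient updates over 1000 environment steps.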

bool OptimizeMemoryUsage

bool OptimizeMemoryUsage = false

Whether to optimize memory usage.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 54, column 6)

bool LearnEntCoef

bool LearnEntCoef = true

Whether to learn the entropy coefficient during training.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 58, column 6)

float InitialEntCoef

float InitialEntCoef = 1.0

The initial entropy coefficient for the SAC algorithm.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 62, column 7)

int TargetUpdateInterval

int TargetUpdateInterval = 1

The interval at which we update the target network.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 66, column 5)

FString TargetEntropy

FString TargetEntropy = "auto"

The target entropy for the SAC algorithm.

Use "auto" to have the target entropy determined automatically.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 70, column 9)
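Because this member is a string, a consumer has to distinguish "auto" from a numeric value. The sketch below shows one way to do that in standard C++. It assumes any value other than "auto" is a decimal number. `TargetEntropySpec` and `ParseTargetEntropy` are hypothetical names and do not describe how Schola or the training script actually parses the value.

```cpp
#include <string>

// Result of interpreting the TargetEntropy string.
struct TargetEntropySpec
{
    bool bAuto = true;   // True when the string was "auto".
    float Value = 0.0f;  // Only meaningful when bAuto is false.
};

// Interpret "auto" as automatic; otherwise parse a float.
// std::stof throws std::invalid_argument for non-numeric input.
TargetEntropySpec ParseTargetEntropy(const std::string& Text)
{
    TargetEntropySpec Spec;
    if (Text == "auto")
    {
        return Spec;
    }
    Spec.bAuto = false;
    Spec.Value = std::stof(Text);
    return Spec;
}
```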

bool UseSDE

bool UseSDE = false

Whether to use state-dependent exploration (SDE) noise.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 74, column 6)

int SDESampleFreq

int SDESampleFreq = -1

The frequency, in steps, at which new state-dependent exploration noise is sampled when UseSDE is enabled.

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 78, column 5)

Used By: FSB3TrainingSettings

Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 15, column 1)