FSB3SACSettings
struct FSB3SACSettings : public FTrainingSettings
A struct to hold SAC settings for an SB3 training script.
Note: This is a partial implementation of the SAC settings and is not exhaustive.
Dependencies: FScriptArgBuilder, FTrainingSettings
Inherits from: public FTrainingSettings
Public Interface
Destructor:
~FSB3SACSettings
virtual ~FSB3SACSettings()
Attributes: virtual
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 82, column 9)
Implementation: Schola/Source/Schola/Private/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.cpp (line 31)
Public Functions:
GenerateTrainingArgs
virtual void GenerateTrainingArgs(int Port, FScriptArgBuilder &ArgBuilder) const
Generate the training arguments for the script using the ArgBuilder.
Note: Port is supplied because it is a common argument to pass to scripts. It is set at a high level but might be needed by specific subsettings.
Parameters:
Port (int) – [in] The port to use for the script
ArgBuilder (FScriptArgBuilder &) – [in] The builder to use to generate the arguments
Attributes: const, virtual
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 80, column 6)
Implementation: Schola/Source/Schola/Private/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.cpp (lines 6-30)
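The pattern described above, a settings struct writing its values into a builder, can be sketched in plain C++. This is only an illustration: ArgBuilderSketch is a hypothetical stand-in for FScriptArgBuilder (whose real methods are not shown on this page), SacSettingsSketch mirrors only a few of the documented members, and the command-line flag names are assumptions.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for FScriptArgBuilder. The real Schola class and its
// method names are not documented here and may differ.
struct ArgBuilderSketch {
    std::ostringstream Out;

    // Appends " --name value" to the argument string.
    template <typename T>
    void AddArg(const std::string& Name, const T& Value) {
        Out << " --" << Name << " " << Value;
    }

    // Appends a bare " --name" flag only when the condition holds.
    void AddFlag(const std::string& Name, bool bCondition) {
        if (bCondition) {
            Out << " --" << Name;
        }
    }

    std::string Build() const { return Out.str(); }
};

// Plain-C++ mirror of a few FSB3SACSettings members with their documented defaults.
struct SacSettingsSketch {
    float LearningRate = 0.0003f;
    int BufferSize = 1000000;
    int BatchSize = 256;
    bool UseSDE = false;

    // Port matches the documented signature; this sketch does not need it.
    void GenerateTrainingArgs(int /*Port*/, ArgBuilderSketch& ArgBuilder) const {
        // Flag names below are assumed for illustration only.
        ArgBuilder.AddArg("learning-rate", LearningRate);
        ArgBuilder.AddArg("buffer-size", BufferSize);
        ArgBuilder.AddArg("batch-size", BatchSize);
        ArgBuilder.AddFlag("use-sde", UseSDE);
    }
};

// Convenience wrapper: builds the argument string for default settings.
std::string BuildDefaultSacArgs(int Port) {
    ArgBuilderSketch Builder;
    SacSettingsSketch Settings;
    Settings.GenerateTrainingArgs(Port, Builder);
    return Builder.Build();
}
```

Passing the builder by reference lets several settings structs append to the same argument list, which is presumably why Port is threaded through even when a given struct ignores it.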
Public Members:
float LearningRate
float LearningRate = 0.0003
The learning rate for the SAC algorithm.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 22, column 7)
int BufferSize
int BufferSize = 1000000
The buffer size for the SAC algorithm.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 26, column 5)
int LearningStarts
int LearningStarts = 100
The number of steps to take before learning starts.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 30, column 5)
int BatchSize
int BatchSize = 256
The batch size to use during gradient descent.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 34, column 5)
float Tau
float Tau = 0.005
The Tau value for the SAC algorithm, i.e. the soft update coefficient for the target network.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 38, column 7)
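In Stable-Baselines3, tau is the coefficient of the soft ("Polyak") update that blends the online network's weights into the target network. A minimal sketch of that update on flat weight vectors follows; SoftUpdate is a hypothetical helper written for this page, not part of Schola or SB3.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Soft (Polyak) update: Target <- Tau * Online + (1 - Tau) * Target.
// With the default Tau = 0.005 the target moves 0.5% of the way per update.
void SoftUpdate(std::vector<double>& Target,
                const std::vector<double>& Online,
                double Tau) {
    assert(Target.size() == Online.size());
    for (std::size_t i = 0; i < Target.size(); ++i) {
        Target[i] = Tau * Online[i] + (1.0 - Tau) * Target[i];
    }
}
```

A small Tau keeps the target network changing slowly, while Tau = 1 would copy the online weights outright.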
float Gamma
float Gamma = 0.99
The gamma value for the SAC algorithm, i.e. the discount factor.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 42, column 7)
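Gamma is the discount factor applied to future rewards. As a rough illustration of its effect, the sketch below computes a discounted return over a finite reward sequence; DiscountedReturn is a hypothetical helper and is not how SAC itself consumes Gamma during training.

```cpp
#include <vector>

// Discounted return: Rewards[0] + Gamma * Rewards[1] + Gamma^2 * Rewards[2] + ...
// Accumulated from the last reward backwards to avoid computing powers.
double DiscountedReturn(const std::vector<double>& Rewards, double Gamma) {
    double Return = 0.0;
    for (auto It = Rewards.rbegin(); It != Rewards.rend(); ++It) {
        Return = *It + Gamma * Return;
    }
    return Return;
}
```

For example, three rewards of 1 with Gamma = 0.5 give 1 + 0.5 + 0.25 = 1.75. The default of 0.99 discounts far more gently.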
int TrainFreq
int TrainFreq = 1
The frequency, in steps, at which the model is updated.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 46, column 5)
int GradientSteps
int GradientSteps = 1
The number of gradient steps to take during training.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 50, column 5)
bool OptimizeMemoryUsage
bool OptimizeMemoryUsage = false
Optimize memory usage.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 54, column 6)
bool LearnEntCoef
bool LearnEntCoef = true
Whether to learn the entropy coefficient during training.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 58, column 6)
float InitialEntCoef
float InitialEntCoef = 1.0
The initial entropy coefficient for the SAC algorithm.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 62, column 7)
int TargetUpdateInterval
int TargetUpdateInterval = 1
The interval at which we update the target network.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 66, column 5)
FString TargetEntropy
FString TargetEntropy = "auto"
The target entropy for the SAC algorithm. Use "auto" to have the target entropy set automatically.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 70, column 9)
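TargetEntropy is a string because it can hold either the keyword "auto" or a number. In Stable-Baselines3, "auto" resolves to the negative of the action-space dimensionality. The sketch below shows that resolution in C++ purely for illustration; ResolveTargetEntropy and ActionDim are hypothetical names, and in practice the string is presumably forwarded to the Python training script, which does the resolving.

```cpp
#include <string>

// Resolves a TargetEntropy setting to a number.
// "auto" follows the SB3 convention of -dim(action space);
// any other value is parsed as a floating-point number
// (std::stod throws std::invalid_argument if it is not numeric).
double ResolveTargetEntropy(const std::string& TargetEntropy, int ActionDim) {
    if (TargetEntropy == "auto") {
        return -static_cast<double>(ActionDim);
    }
    return std::stod(TargetEntropy);
}
```

With a 3-dimensional action space, "auto" would resolve to -3, while a setting of "-2.5" is used as given.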
bool UseSDE
bool UseSDE = false
Use state dependent exploration (SDE) noise.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 74, column 6)
int SDESampleFreq
int SDESampleFreq = -1
The frequency at which to sample the state dependent exploration noise.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 78, column 5)
Used By: FSB3TrainingSettings
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 15, column 1)