FSB3SACSettings
struct FSB3SACSettings : public FTrainingSettings
A struct to hold SAC settings for an SB3 training script.
Note: This is a partial implementation of the SAC settings and is not exhaustive.
Dependencies: FScriptArgBuilder, FTrainingSettings
Inherits from: public FTrainingSettings
Public Interface
Destructor:
~FSB3SACSettings
virtual ~FSB3SACSettings()
Attributes: virtual
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 82, column 9)
Implementation: Schola/Source/Schola/Private/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.cpp (line 31)
Public Functions:
GenerateTrainingArgs
virtual void GenerateTrainingArgs(int Port, FScriptArgBuilder &ArgBuilder) const
Generate the training arguments for the script using the ArgBuilder.
Note: Port is supplied because it is a common argument to pass to scripts. It is set at a higher level but may be needed by specific subsettings.
Parameters:
Port (int) – [in] The port to use for the script
ArgBuilder (FScriptArgBuilder &) – [in] The builder to use to generate the arguments
Attributes: const, virtual
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 80, column 6)
Implementation: Schola/Source/Schola/Private/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.cpp (lines 6-30)
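The pattern that GenerateTrainingArgs describes (a settings struct appending its values to a builder) can be illustrated with a small standalone sketch. Everything here is hypothetical: `FArgBuilderSketch`, `FSACSettingsSketch`, the `AddArg`/`AddFlagIf` methods, and the flag names are stand-ins chosen for illustration, not the actual FScriptArgBuilder API or the flags Schola emits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for FScriptArgBuilder; the real class may expose a different API.
struct FArgBuilderSketch
{
    std::vector<std::string> Args;

    // Append "--Flag" and its value as two separate tokens.
    void AddArg(const std::string& Flag, const std::string& Value)
    {
        assert(!Flag.empty());
        Args.push_back("--" + Flag);
        Args.push_back(Value);
    }

    // Append a bare "--Flag" only when the condition holds.
    void AddFlagIf(const std::string& Flag, bool bCondition)
    {
        assert(!Flag.empty());
        if (bCondition)
        {
            Args.push_back("--" + Flag);
        }
    }
};

// Hypothetical subset of the SAC settings, using the defaults listed on this page.
struct FSACSettingsSketch
{
    float LearningRate = 0.0003f;
    int BufferSize = 1000000;
    bool UseSDE = false;

    void GenerateTrainingArgs(int Port, FArgBuilderSketch& ArgBuilder) const
    {
        (void)Port; // Not needed by this sketch; kept for interface parity.
        ArgBuilder.AddArg("learning-rate", std::to_string(LearningRate)); // assumed flag name
        ArgBuilder.AddArg("buffer-size", std::to_string(BufferSize));     // assumed flag name
        ArgBuilder.AddFlagIf("use-sde", UseSDE);                          // assumed flag name
    }
};
```

With default values the sketch produces four tokens; setting `UseSDE` to true appends a fifth, bare flag.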
Public Members:
float LearningRate
float LearningRate = 0.0003
The learning rate for the SAC algorithm.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 22, column 7)
int BufferSize
int BufferSize = 1000000
The size of the replay buffer for the SAC algorithm.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 26, column 5)
int LearningStarts
int LearningStarts = 100
The number of steps to take before learning starts.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 30, column 5)
int BatchSize
int BatchSize = 256
The batch size to use during gradient descent.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 34, column 5)
float Tau
float Tau = 0.005
The Tau value for the SAC algorithm, i.e. the soft-update (Polyak averaging) coefficient for the target network.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 38, column 7)
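Assuming Tau is forwarded to the SB3 `tau` parameter (the soft-update coefficient), its effect can be sketched as an element-wise blend of online and target weights. The function name `SoftUpdate` and the use of flat float vectors are illustrative assumptions; this is not code from Schola or SB3.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Polyak (soft) update: Target <- Tau * Online + (1 - Tau) * Target, element-wise.
// A small Tau (such as the default 0.005) makes the target track the online weights slowly.
void SoftUpdate(const std::vector<float>& Online, std::vector<float>& Target, float Tau)
{
    assert(Online.size() == Target.size());
    assert(Tau >= 0.0f && Tau <= 1.0f);
    for (std::size_t Index = 0; Index < Target.size(); ++Index)
    {
        Target[Index] = Tau * Online[Index] + (1.0f - Tau) * Target[Index];
    }
}
```

With Tau = 1 the target is overwritten by the online weights; with Tau = 0 it never changes.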
float Gamma
float Gamma = 0.99
The gamma value (discount factor) for the SAC algorithm.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 42, column 7)
int TrainFreq
int TrainFreq = 1
The frequency, in steps, at which the model is trained.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 46, column 5)
int GradientSteps
int GradientSteps = 1
The number of gradient steps to take during training.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 50, column 5)
bool OptimizeMemoryUsage
bool OptimizeMemoryUsage = false
Whether to use the memory-efficient variant of the replay buffer.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 54, column 6)
bool LearnEntCoef
bool LearnEntCoef = true
Whether to learn the entropy coefficient during training.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 58, column 6)
float InitialEntCoef
float InitialEntCoef = 1.0
The initial entropy coefficient for the SAC algorithm.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 62, column 7)
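SB3's SAC accepts its `ent_coef` argument either as a plain number (fixed coefficient) or as a string such as `auto` or `auto_0.1` (learned coefficient with the given initial value). The sketch below shows one way LearnEntCoef and InitialEntCoef could be combined into such a value. Whether Schola builds the argument this way is an assumption, and `MakeEntCoefArg` is a hypothetical helper.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: combine the two settings into a single ent_coef-style string.
// Learned coefficient -> "auto_<initial>", fixed coefficient -> "<initial>".
std::string MakeEntCoefArg(bool bLearnEntCoef, float InitialEntCoef)
{
    assert(InitialEntCoef > 0.0f); // A learned coefficient needs a positive starting value.
    std::ostringstream Out;
    if (bLearnEntCoef)
    {
        Out << "auto_";
    }
    Out << InitialEntCoef;
    return Out.str();
}
```

Under this assumption, a learned coefficient with initial value 0.5 would be passed as `auto_0.5`, and the same value with learning disabled as `0.5`.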
int TargetUpdateInterval
int TargetUpdateInterval = 1
The interval, in gradient steps, at which the target network is updated.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 66, column 5)
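How LearningStarts, TrainFreq, GradientSteps, and TargetUpdateInterval interact can be sketched with a simple counter simulation. This assumes the fields map to the SB3 parameters of similar names (`learning_starts`, `train_freq`, `gradient_steps`, `target_update_interval`) and that GradientSteps is non-negative. It is a simplified reading of the schedule, not SB3's actual training loop, and `SimulateSchedule` and `FUpdateCounts` are hypothetical names.

```cpp
#include <cassert>

struct FUpdateCounts
{
    int GradientUpdates = 0;
    int TargetUpdates = 0;
};

// Count how many gradient and target-network updates would occur over TotalSteps
// environment steps under a simplified interpretation of the four settings.
FUpdateCounts SimulateSchedule(int TotalSteps, int LearningStarts, int TrainFreq,
                               int GradientSteps, int TargetUpdateInterval)
{
    assert(TrainFreq > 0 && GradientSteps >= 0 && TargetUpdateInterval > 0);
    FUpdateCounts Counts;
    for (int Step = 1; Step <= TotalSteps; ++Step)
    {
        // One environment transition would be collected here (omitted).
        const bool bTrainNow = Step > LearningStarts && Step % TrainFreq == 0;
        if (!bTrainNow)
        {
            continue;
        }
        for (int GradStep = 0; GradStep < GradientSteps; ++GradStep)
        {
            ++Counts.GradientUpdates;
            // The target network is refreshed every TargetUpdateInterval gradient updates.
            if (Counts.GradientUpdates % TargetUpdateInterval == 0)
            {
                ++Counts.TargetUpdates;
            }
        }
    }
    return Counts;
}
```

With the defaults on this page (LearningStarts = 100, and 1 for the other three), every step after the first 100 triggers one gradient update and one target update.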
FString TargetEntropy
FString TargetEntropy = "auto"
The target entropy for the SAC algorithm. Use "auto" to have the target entropy determined automatically.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 70, column 9)
bool UseSDE
bool UseSDE = false
Whether to use state-dependent exploration (SDE) noise.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 74, column 6)
int SDESampleFreq
int SDESampleFreq = -1
The frequency, in steps, at which to resample the state-dependent exploration noise. In SB3, -1 means the noise is sampled only at the beginning of a rollout.
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 78, column 5)
Used By: FSB3TrainingSettings
Source: Schola/Source/Schola/Public/Subsystem/SubsystemSettings/StableBaselines/Algorithms/SB3SACSettings.h (line 15, column 1)
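For quick reference, the defaults listed on this page can be mirrored in a plain C++ struct and checked with a basic sanity test before launching a run. `FSACSettingsMirror`, `LooksReasonable`, and `MakeShortRunSettings` are hypothetical and exist only for this sketch. The real FSB3SACSettings uses FString and Unreal types, and the checks below are illustrative constraints rather than rules enforced by Schola.

```cpp
#include <cassert>
#include <string>

// Hypothetical plain-C++ mirror of the public members and defaults documented above.
struct FSACSettingsMirror
{
    float LearningRate = 0.0003f;
    int BufferSize = 1000000;
    int LearningStarts = 100;
    int BatchSize = 256;
    float Tau = 0.005f;
    float Gamma = 0.99f;
    int TrainFreq = 1;
    int GradientSteps = 1;
    bool OptimizeMemoryUsage = false;
    bool LearnEntCoef = true;
    float InitialEntCoef = 1.0f;
    int TargetUpdateInterval = 1;
    std::string TargetEntropy = "auto"; // FString in the real struct
    bool UseSDE = false;
    int SDESampleFreq = -1;
};

// Illustrative sanity checks; these constraints are assumptions, not Schola's validation.
bool LooksReasonable(const FSACSettingsMirror& Settings)
{
    return Settings.LearningRate > 0.0f
        && Settings.BufferSize > 0
        && Settings.BatchSize > 0 && Settings.BatchSize <= Settings.BufferSize
        && Settings.Tau >= 0.0f && Settings.Tau <= 1.0f
        && Settings.Gamma >= 0.0f && Settings.Gamma <= 1.0f
        && Settings.TrainFreq > 0
        && Settings.TargetUpdateInterval > 0;
}

// Example override: a smaller replay buffer and a longer warm-up for a short experiment.
FSACSettingsMirror MakeShortRunSettings()
{
    FSACSettingsMirror Settings;
    Settings.BufferSize = 100000;
    Settings.LearningStarts = 1000;
    assert(LooksReasonable(Settings));
    return Settings;
}
```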