schola.scripts.ray.settings.IMPALASettings
class schola.scripts.ray.settings.IMPALASettings(vtrace=True, vtrace_clip_rho_threshold=1.0, vtrace_clip_pg_rho_threshold=1.0)
: Bases: RLLibAlgorithmSpecificSettings
Dataclass holding the settings specific to the IMPALA (Importance Weighted Actor-Learner Architecture) algorithm. It defines the V-trace parameters that IMPALA uses for off-policy correction.
Methods
__init__([vtrace, …]) | |
get_parser() | Add the settings to the parser or subparser |
get_settings_dict() | Get the settings as a dictionary keyed by the correct parameter name in Ray |
Attributes
name | |
rllib_config | |
vtrace | Whether to use the V-trace algorithm for off-policy correction in the IMPALA algorithm. |
vtrace_clip_pg_rho_threshold | The clip threshold for V-trace rho values in the policy gradient. |
vtrace_clip_rho_threshold | The clip threshold for V-trace rho values. |
Parameters: : - vtrace (bool)
- vtrace_clip_rho_threshold (float)
- vtrace_clip_pg_rho_threshold (float)
__init__(vtrace=True, vtrace_clip_rho_threshold=1.0, vtrace_clip_pg_rho_threshold=1.0) : Parameters: : - vtrace (bool)
- vtrace_clip_rho_threshold (float)
- vtrace_clip_pg_rho_threshold (float)
Return type: : None
classmethod get_parser() : Add the settings to the parser or subparser
get_settings_dict() : Get the settings as a dictionary keyed by the correct parameter name in Ray
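As a rough illustration of what "a dictionary keyed by the correct parameter name in Ray" could look like, the sketch below uses a stand-in dataclass with the same three fields and defaults as documented here. `IMPALASettingsSketch` is a hypothetical name, not the Schola class, and its `get_settings_dict` is an assumption about the behaviour rather than the library's implementation.

```python
from dataclasses import asdict, dataclass


@dataclass
class IMPALASettingsSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    vtrace: bool = True
    vtrace_clip_rho_threshold: float = 1.0
    vtrace_clip_pg_rho_threshold: float = 1.0

    def get_settings_dict(self) -> dict:
        # Assumption: field names already match RLlib's parameter names,
        # so a plain field-to-value mapping is enough for this sketch.
        return asdict(self)


# Override one threshold and keep the other defaults.
settings = IMPALASettingsSketch(vtrace_clip_rho_threshold=2.0)
training_kwargs = settings.get_settings_dict()

# With Ray installed, a dict like this would typically be unpacked into the
# algorithm config, e.g. IMPALAConfig().training(**training_kwargs).
print(training_kwargs)
```

Keeping the field names identical to the RLlib parameter names means the dictionary can be unpacked directly, with no renaming step.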
property name*: str*
property rllib_config*: Type[IMPALAConfig]*
vtrace*: bool* = True : Whether to use the V-trace algorithm for off-policy correction in the IMPALA algorithm. V-trace corrects the bias that arises when the data was collected by a behaviour policy that lags behind the policy being trained. It reweights the updates with truncated importance sampling ratios, which keeps the value estimates more accurate and stable.
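vtrace_clip_pg_rho_threshold*: float* = 1.0 : The clip threshold for the V-trace rho values (importance sampling ratios between the target and behaviour policies) used in the policy gradient.
vtrace_clip_rho_threshold*: float* = 1.0 : The clip threshold for the V-trace rho values (importance sampling ratios between the target and behaviour policies) used when computing the value targets.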
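To make the role of the two thresholds concrete, here is a minimal pure-Python sketch of the V-trace computation as described in the IMPALA paper. It is not Schola's or RLlib's implementation: the function name `vtrace_targets`, its argument names, the list-based inputs, and the fixed clip of 1.0 on the trace coefficients are all assumptions made for illustration.

```python
import math


def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99,
                   clip_rho_threshold=1.0, clip_pg_rho_threshold=1.0):
    """Return (value targets, policy-gradient advantages) for one trajectory."""
    values = list(values)
    # Importance sampling ratios pi(a|x) / mu(a|x) from log-probabilities.
    ratios = [math.exp(t - b) for t, b in zip(target_logp, behaviour_logp)]
    # vtrace_clip_rho_threshold caps the ratios used for the value targets.
    rhos = [min(clip_rho_threshold, r) for r in ratios]
    # vtrace_clip_pg_rho_threshold caps the ratios used in the policy gradient.
    pg_rhos = [min(clip_pg_rho_threshold, r) for r in ratios]
    # Trace-cutting coefficients, clipped at 1.0 (assumed fixed in this sketch).
    cs = [min(1.0, r) for r in ratios]

    next_values = values[1:] + [bootstrap_value]
    deltas = [rho * (r + gamma * nv - v)
              for rho, r, nv, v in zip(rhos, rewards, next_values, values)]

    # Backward recursion: vs_t - V_t = delta_t + gamma * c_t * (vs_{t+1} - V_{t+1}).
    acc = 0.0
    vs = [0.0] * len(values)
    for t in reversed(range(len(values))):
        acc = deltas[t] + gamma * cs[t] * acc
        vs[t] = values[t] + acc

    next_vs = vs[1:] + [bootstrap_value]
    pg_advantages = [p * (r + gamma * nvs - v)
                     for p, r, nvs, v in zip(pg_rhos, rewards, next_vs, values)]
    return vs, pg_advantages


# Target policy twice as likely as the behaviour policy: ratio 2.0, clipped to 1.0.
vs, pg_adv = vtrace_targets(
    behaviour_logp=[0.0, 0.0],
    target_logp=[math.log(2.0), math.log(2.0)],
    rewards=[1.0, 1.0],
    values=[0.0, 0.0],
    bootstrap_value=0.0,
    gamma=1.0,
)
print(vs, pg_adv)
```

With both thresholds at their default of 1.0, any ratio above 1 is truncated, so actions the target policy favours more than the behaviour policy did are not up-weighted. Raising a threshold reduces that bias at the cost of higher variance.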