schola.scripts.ray.settings.APPOSettings

Class Definition

class schola.scripts.ray.settings.APPOSettings(gae_lambda=0.95, clip_param=0.2, use_gae=True, vtrace=True, vtrace_clip_rho_threshold=1.0, vtrace_clip_pg_rho_threshold=1.0)

Bases: IMPALASettings, PPOSettings

Dataclass holding the settings specific to the APPO (Asynchronous Proximal Policy Optimization) algorithm. It inherits from both IMPALASettings and PPOSettings, so a single settings object exposes IMPALA's V-trace off-policy correction alongside PPO's clipped policy optimization.
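As a rough illustration of the inheritance described above, the sketch below uses stand-in dataclasses. The `...Sketch` class names and the way the six fields are split between the two bases are assumptions made for this example, not Schola's actual definitions. Under that assumption, Python's dataclass field ordering produces the same parameter order as the documented constructor.

```python
from dataclasses import dataclass, fields


# Hypothetical stand-ins: the real PPOSettings / IMPALASettings may define
# more (or different) fields than the ones shown here.
@dataclass
class PPOSettingsSketch:
    gae_lambda: float = 0.95
    clip_param: float = 0.2
    use_gae: bool = True


@dataclass
class IMPALASettingsSketch:
    vtrace: bool = True
    vtrace_clip_rho_threshold: float = 1.0
    vtrace_clip_pg_rho_threshold: float = 1.0


@dataclass
class APPOSettingsSketch(IMPALASettingsSketch, PPOSettingsSketch):
    """Combines both groups of settings; adds no fields of its own."""


# Override one default and keep the rest.
settings = APPOSettingsSketch(clip_param=0.3)

# Dataclasses collect base-class fields in reverse MRO order, so the
# PPO fields come first, followed by the IMPALA fields.
field_names = [f.name for f in fields(settings)]
print(field_names)
```

With the real class, the equivalent call would presumably be `APPOSettings(clip_param=0.3)`, using the defaults listed in the signature for everything else.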

Parameters

gae_lambda

Type: float

clip_param

Type: float

use_gae

Type: bool

vtrace

Type: bool

vtrace_clip_rho_threshold

Type: float

vtrace_clip_pg_rho_threshold

Type: float

Attributes

clip_param

The clip parameter for the PPO algorithm.

gae_lambda

The lambda parameter for Generalized Advantage Estimation (GAE).

name

Type: str

rllib_config

Type: Type[APPOConfig]

use_gae

Whether to use Generalized Advantage Estimation (GAE) for advantage calculation.

vtrace

Whether to use the V-trace algorithm (inherited from IMPALA) for off-policy correction.

vtrace_clip_pg_rho_threshold

The clipping threshold for the V-trace importance weights (rho) used in the policy-gradient term.

vtrace_clip_rho_threshold

The clipping threshold for the V-trace importance weights (rho).
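To make the numeric attributes above more concrete, here is a minimal sketch of the textbook operations they usually control: PPO ratio clipping, V-trace importance-weight clipping, and GAE. The function names and the `gamma` discount are assumptions introduced for this example (`gamma` is not a field of this class), and the code does not reproduce how RLlib applies these settings internally.

```python
def clip_ppo_ratio(ratio, clip_param=0.2):
    """Clamp a probability ratio to [1 - clip_param, 1 + clip_param]."""
    return max(1.0 - clip_param, min(ratio, 1.0 + clip_param))


def clip_vtrace_rho(ratio, threshold=1.0):
    """Truncate an importance weight from above, as V-trace does."""
    return min(threshold, ratio)


def gae_advantages(rewards, values, gamma=0.99, gae_lambda=0.95):
    """Generalized Advantage Estimation over one trajectory.

    `values` must hold one more entry than `rewards`; the final entry
    is the bootstrap value of the state after the last step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Accumulate discounted TD errors from the last step backwards.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * gae_lambda * running
        advantages[t] = running
    return advantages


# A ratio far from 1 is pulled back into the PPO trust interval.
high_ratio = clip_ppo_ratio(1.5, clip_param=0.2)
low_ratio = clip_ppo_ratio(0.5, clip_param=0.2)

# V-trace only truncates large weights; small ones pass through.
truncated = clip_vtrace_rho(2.5, threshold=1.0)
untouched = clip_vtrace_rho(0.4, threshold=1.0)

# Two steps of reward 1 with a zero value function.
adv = gae_advantages([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, gae_lambda=0.5)
```

A smaller gae_lambda weights near-term TD errors more heavily, which lowers variance at the cost of more bias; lowering either V-trace threshold truncates the importance weights more aggressively.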

Methods

__init__

__init__(gae_lambda=0.95, clip_param=0.2, use_gae=True, vtrace=True, vtrace_clip_rho_threshold=1.0, vtrace_clip_pg_rho_threshold=1.0)

Return type: None

get_parser

classmethod get_parser()

Add the settings to the parser or subparser.

get_settings_dict

get_settings_dict()

Get the settings as a dictionary keyed by the corresponding parameter names in Ray.
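The idea of re-keying settings to Ray's parameter names can be sketched as follows. This is not Schola's implementation: the stand-in dataclass, the `to_ray_settings_dict` helper, and the rename table are hypothetical. The single rename shown assumes that `gae_lambda` corresponds to the `lambda_` argument RLlib uses for the GAE lambda; other fields are assumed to keep their names.

```python
from dataclasses import asdict, dataclass


# Hypothetical stand-in mirroring the documented constructor defaults.
@dataclass
class APPOSettingsSketch:
    gae_lambda: float = 0.95
    clip_param: float = 0.2
    use_gae: bool = True
    vtrace: bool = True
    vtrace_clip_rho_threshold: float = 1.0
    vtrace_clip_pg_rho_threshold: float = 1.0


# Assumed mapping from settings field name to Ray parameter name.
RAY_NAME_OVERRIDES = {"gae_lambda": "lambda_"}


def to_ray_settings_dict(settings):
    """Return the settings as a dict keyed by the (assumed) Ray names."""
    return {
        RAY_NAME_OVERRIDES.get(name, name): value
        for name, value in asdict(settings).items()
    }


ray_settings = to_ray_settings_dict(APPOSettingsSketch(clip_param=0.3))
print(ray_settings)
```

A dictionary of this shape could then be unpacked into an RLlib config call, which is presumably the purpose of get_settings_dict().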