schola.scripts.ray.settings.PPOSettings

Class Definition

class schola.scripts.ray.settings.PPOSettings(gae_lambda=0.95, clip_param=0.2, use_gae=True)

Bases: RLLibAlgorithmSpecificSettings

Dataclass holding the settings specific to the PPO (Proximal Policy Optimization) algorithm: the GAE lambda, the clip parameter, and whether to use GAE.

Parameters

gae_lambda

Type: float

clip_param

Type: float

use_gae

Type: bool

Attributes

clip_param

Type: float
Default: 0.2

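The clip parameter for the PPO algorithm. This is the epsilon value used in the clipped surrogate objective function: the probability ratio between the new and old policies is clipped to the range [1 - epsilon, 1 + epsilon]. Clipping limits the size of each policy update and so guards against large changes that could lead to performance collapse.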
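To make the role of clip_param concrete, here is a minimal per-sample sketch of the clipped surrogate objective in plain Python. The function name `clipped_surrogate` and its arguments are illustrative assumptions, not part of Schola or RLlib; RLlib computes this internally on batched tensors.

```python
def clipped_surrogate(ratio, advantage, clip_param=0.2):
    """Per-sample PPO clipped surrogate objective (a quantity to maximize).

    ratio      -- pi_new(a|s) / pi_old(a|s) for the sampled action
    advantage  -- advantage estimate for that action
    clip_param -- epsilon; the ratio is clipped to [1 - eps, 1 + eps]
    """
    clipped_ratio = max(1.0 - clip_param, min(ratio, 1.0 + clip_param))
    # Taking the minimum makes the objective a pessimistic bound, so the
    # policy gains nothing by moving the ratio far outside the clip range.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # A ratio far above 1 + eps earns no extra credit for a positive advantage.
    print(clipped_surrogate(1.5, 1.0))
    # A ratio inside the clip range is left untouched.
    print(clipped_surrogate(1.1, 2.0))
```

A smaller clip_param keeps each update closer to the old policy; a larger one allows bigger steps.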

gae_lambda

Type: float
Default: 0.95

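The lambda parameter for Generalized Advantage Estimation (GAE). This controls the trade-off between bias and variance in the advantage estimation: values closer to 1 reduce bias at the cost of higher variance, while values closer to 0 reduce variance at the cost of higher bias.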
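The effect of gae_lambda can be seen in a small, self-contained sketch of the GAE recursion. The function `compute_gae`, the discount factor `gamma`, and the `last_value` bootstrap argument are assumptions introduced for illustration; they are not PPOSettings fields, and RLlib performs this computation itself.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, gae_lambda=0.95):
    """Return GAE advantages for one trajectory segment.

    rewards[t], values[t] -- reward and value estimate at step t
    last_value            -- value estimate for the state after the final
                             step (use 0.0 if the episode terminated)
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the accumulated tail.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * gae_lambda * running
        advantages[t] = running
        next_value = values[t]
    return advantages


if __name__ == "__main__":
    rewards, values = [1.0, 1.0, 1.0], [0.5, 0.5, 0.5]
    # lambda = 0 uses only the one-step TD error (lower variance, more bias).
    print(compute_gae(rewards, values, 0.0, gamma=1.0, gae_lambda=0.0))
    # lambda = 1 uses the full return minus the baseline (less bias, more variance).
    print(compute_gae(rewards, values, 0.0, gamma=1.0, gae_lambda=1.0))
```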

name

Type: str

rllib_config

Type: Type[PPOConfig]

use_gae

Type: bool
Default: True

Whether to use Generalized Advantage Estimation (GAE) for advantage calculation. GAE is a method to reduce the variance of the advantage estimates while keeping bias low. If set to False, the standard advantage calculation will be used instead.

Methods

__init__

__init__(gae_lambda=0.95, clip_param=0.2, use_gae=True)

Return type: None

get_parser

classmethod get_parser()

Add the settings to the parser or subparser.

get_settings_dict

get_settings_dict()

Get the settings as a dictionary keyed by the correct parameter name in Ray.
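The following sketch illustrates how a settings class of this shape might expose command-line flags and an RLlib-style dictionary. `PPOSettingsSketch` is a hypothetical stand-in, not the Schola class: the flag names are invented for illustration, and the `lambda_` key is an assumption based on the keyword that RLlib's `PPOConfig.training()` accepts. Check the Schola source for the actual behavior.

```python
import argparse
from dataclasses import dataclass


@dataclass
class PPOSettingsSketch:
    """Hypothetical stand-in mirroring the documented constructor."""

    gae_lambda: float = 0.95
    clip_param: float = 0.2
    use_gae: bool = True

    @classmethod
    def get_parser(cls):
        # Assumed behavior: one command-line flag per setting.
        # The flag names below are illustrative only.
        parser = argparse.ArgumentParser()
        parser.add_argument("--gae-lambda", type=float, default=0.95)
        parser.add_argument("--clip-param", type=float, default=0.2)
        parser.add_argument("--disable-gae", dest="use_gae",
                            action="store_false")
        return parser

    def get_settings_dict(self):
        # Assumed key names: PPOConfig.training() takes lambda_,
        # clip_param and use_gae as keyword arguments.
        return {
            "lambda_": self.gae_lambda,
            "clip_param": self.clip_param,
            "use_gae": self.use_gae,
        }


if __name__ == "__main__":
    args = PPOSettingsSketch.get_parser().parse_args(["--clip-param", "0.1"])
    settings = PPOSettingsSketch(args.gae_lambda, args.clip_param, args.use_gae)
    print(settings.get_settings_dict())
```

With a dictionary shaped like this, the values could be forwarded as keyword arguments to an RLlib config's training step, which is presumably why the keys follow Ray's parameter names rather than the attribute names.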
