schola.scripts.ray.settings.NetworkArchitectureSettings
- class schola.scripts.ray.settings.NetworkArchitectureSettings(fcnet_hiddens=<factory>, activation=ActivationFunctionEnum.ReLU, use_attention=False, attention_dim=64)[source]
-
Bases: object
Dataclass for network architecture settings used in the RLlib training process. This class defines the parameters for the neural network architecture used for policy and value function approximation. This includes the hidden layer sizes, activation functions, and whether to use an attention mechanism. These settings help to control the complexity and capacity of the neural network model used in the training process.
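A minimal usage sketch, assuming only the class itself as imported from this module; the non-default values are illustrative, not recommendations:

    from schola.scripts.ray.settings import NetworkArchitectureSettings

    # Default architecture: two hidden layers of 512 units with ReLU activation.
    default_settings = NetworkArchitectureSettings()

    # A smaller fully connected network (illustrative layer sizes).
    small_settings = NetworkArchitectureSettings(fcnet_hiddens=[256, 256])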
Methods

__init__([fcnet_hiddens, activation, …])
populate_arg_group(args_group)

Attributes

activation: The activation function to use for the fully connected network.
attention_dim: The dimension of the attention layer.
use_attention: Whether to use an attention mechanism in the model.
fcnet_hiddens: The hidden layer architecture for the fully connected network.
- Parameters:
  - fcnet_hiddens (List[int])
  - activation (ActivationFunctionEnum)
  - use_attention (bool)
  - attention_dim (int)
- __init__(fcnet_hiddens=<factory>, activation=ActivationFunctionEnum.ReLU, use_attention=False, attention_dim=64)
  - Parameters:
    - fcnet_hiddens (List[int])
    - activation (ActivationFunctionEnum)
    - use_attention (bool)
    - attention_dim (int)
  - Return type: None
- activation: ActivationFunctionEnum = 'relu'
-
The activation function to use for the fully connected network. This specifies the non-linear activation function applied to each neuron in the hidden layers of the neural network. The default is ReLU (Rectified Linear Unit), which is a commonly used activation function in deep learning due to its simplicity and effectiveness. Other options may include Tanh, Sigmoid, etc. This can be adjusted based on the specific requirements of the problem and the architecture of the neural network.
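For illustration, a non-default activation could be selected as sketched below; the import location of ActivationFunctionEnum and the Tanh member name are assumptions not confirmed by this page:

    from schola.scripts.ray.settings import NetworkArchitectureSettings
    from schola.scripts.ray.settings import ActivationFunctionEnum  # import path assumed

    # Switch the hidden-layer non-linearity from the default ReLU to Tanh (member name assumed).
    settings = NetworkArchitectureSettings(activation=ActivationFunctionEnum.Tanh)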
- attention_dim: int = 64
-
The dimension of the attention layer. This specifies the size of the output from the attention mechanism if use_attention is set to True. The attention dimension determines how many features will be used to represent the output of the attention layer. A larger value may allow for more complex representations but will also increase the computational cost. The default is 64, which is a common choice for many applications.
- fcnet_hiddens: List[int]
-
The hidden layer architecture for the fully connected network. This specifies the number of neurons in each hidden layer of the neural network used for the policy and value function approximation. The default is [512, 512], which means two hidden layers with 512 neurons each. This can be adjusted based on the complexity of the problem and the size of the input state space.
- property name: str
- classmethod populate_arg_group(args_group)[source]
- use_attention: bool = False
-
Whether to use an attention mechanism in the model. This specifies whether to include an attention layer in the neural network architecture. Note that this attention is applied over the timestep dimension rather than over the inputs.
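A hedged sketch of enabling the attention mechanism; the values are illustrative and how these settings are consumed by the rest of the RLlib model configuration is assumed:

    from schola.scripts.ray.settings import NetworkArchitectureSettings

    # Attend over the timestep dimension, with a wider attention output than the default.
    attention_settings = NetworkArchitectureSettings(
        use_attention=True,
        attention_dim=128,  # illustrative; default is 64
    )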