schola.scripts.ray.settings.ResourceSettings
- class schola.scripts.ray.settings.ResourceSettings(num_gpus=0, num_cpus=1, num_learners=0, num_cpus_for_main_process=1, num_cpus_per_learner=1, num_gpus_per_learner=0)[source]
-
Bases: object
Dataclass for resource settings used in the RLlib training process. It defines the parameters for allocating computational resources, including the number of GPUs and CPUs to use for the training job. These settings control how resources are distributed across the training job, which can affect performance and training times, and they are especially important when running on a cluster or in a distributed environment.
Methods

__init__([num_gpus, num_cpus, num_learners, …])
populate_arg_group(args_group)

Attributes

num_cpus: The total number of CPUs to use for the training process.
num_cpus_for_main_process: The number of CPUs to allocate for the main process.
num_cpus_per_learner: The number of CPUs to allocate for each learner process.
num_gpus: The number of GPUs to use for the training process.
num_gpus_per_learner: The number of GPUs to allocate for each learner process.
num_learners: The number of learner processes to use for the training job.
- Parameters:
  - num_gpus (int | None) – The number of GPUs to use for the training process.
  - num_cpus (int | None) – The total number of CPUs to use for the training process.
  - num_learners (int | None) – The number of learner processes to use for the training job.
  - num_cpus_for_main_process (int | None) – The number of CPUs to allocate for the main process.
  - num_cpus_per_learner (int | None) – The number of CPUs to allocate for each learner process.
  - num_gpus_per_learner (int | None) – The number of GPUs to allocate for each learner process.
- __init__(num_gpus=0, num_cpus=1, num_learners=0, num_cpus_for_main_process=1, num_cpus_per_learner=1, num_gpus_per_learner=0)
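A minimal usage sketch based on the constructor signature above; the specific values (one GPU, eight CPUs, four learners) are illustrative only:

```python
from schola.scripts.ray.settings import ResourceSettings

# Reserve one GPU for training and spread work across four learner
# processes, each pinned to two CPU cores (illustrative values).
resources = ResourceSettings(
    num_gpus=1,
    num_cpus=8,
    num_learners=4,
    num_cpus_for_main_process=1,
    num_cpus_per_learner=2,
    num_gpus_per_learner=0,
)
print(resources.num_learners)  # 4
```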
- property name: str
- num_cpus: int | None = 1
-
The total number of CPUs to use for the training process. This specifies how many CPU cores are available for the RLlib training job. This can be used to parallelize the training process across multiple CPU cores, which can help to speed up training times.
- num_cpus_for_main_process: int | None = 1
-
The number of CPUs to allocate for the main process. This is the number of CPU cores that will be allocated to the main process that manages the training job. This can be used to ensure that the main process has enough resources to handle the workload and manage the learner processes effectively.
- num_cpus_per_learner: int | None = 1
-
The number of CPUs to allocate for each learner process. This specifies how many CPU cores will be allocated to each individual learner process that is used for training. This can be used to ensure that each learner has enough resources to handle its workload and process the training data efficiently.
- num_gpus: int | None = 0
-
The number of GPUs to use for the training process. This specifies how many GPUs are available for the RLlib training job. If set to 0, it will default to CPU training. This can be used to leverage GPU acceleration for faster training times if available.
- num_gpus_per_learner: int | None = 0
-
The number of GPUs to allocate for each learner process. This specifies how many GPUs will be allocated to each individual learner process that is used for training. This is especially relevant when running in a distributed environment or on a cluster.
- num_learners: int | None = 0
-
The number of learner processes to use for the training job. This specifies how many parallel learner processes will be used to train the model. Each learner will process a portion of the training data and update the model weights independently. This can help to speed up training times by leveraging multiple CPU cores or GPUs.
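How these fields reach RLlib is handled by the Schola training script; as a rough sketch only (the `ray.init` and `AlgorithmConfig.learners` calls below are from recent Ray 2.x versions and are assumptions here, not something this class guarantees):

```python
import ray
from ray.rllib.algorithms.ppo import PPOConfig

from schola.scripts.ray.settings import ResourceSettings

resources = ResourceSettings(num_gpus=1, num_cpus=8, num_learners=2,
                             num_cpus_per_learner=2, num_gpus_per_learner=0)

# Cap the local Ray cluster at the totals requested above.
ray.init(num_cpus=resources.num_cpus, num_gpus=resources.num_gpus)

# Hand the per-learner settings to RLlib (Ray 2.x new API stack); the actual
# Schola training script may wire these fields differently.
config = PPOConfig().learners(
    num_learners=resources.num_learners,
    num_cpus_per_learner=resources.num_cpus_per_learner,
    num_gpus_per_learner=resources.num_gpus_per_learner,
)
```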
- classmethod populate_arg_group(args_group)[source]
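Judging by its name and signature, this classmethod registers the settings above on an argparse argument group. A hedged usage sketch (the exact flag names it adds, and their defaults, are assumptions):

```python
import argparse

from schola.scripts.ray.settings import ResourceSettings

parser = argparse.ArgumentParser(description="Launch an RLlib training job")
group = parser.add_argument_group("Resource Settings")

# Let the dataclass register its own command-line options on the group.
ResourceSettings.populate_arg_group(group)

args = parser.parse_args([])  # empty argv: fall back to the field defaults
print(vars(args))
```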