schola.sb3.action_space_patch.HybridDistribution
- class schola.sb3.action_space_patch.HybridDistribution(distributions, discrete_norm_factor=1.0, continuous_norm_factor=1.0)[source]
-
Bases:
DiagGaussianDistribution
A composite distribution supporting discrete and continuous sub-distributions.
- Parameters:
-
-
distributions (OrderedDict[str,Distribution]) – A dictionary of distributions to use for the composite distribution.
-
discrete_norm_factor (float, default=1.0) – The normalization factor applied to discrete actions.
-
continuous_norm_factor (float, default=1.0) – The normalization factor applied to continuous actions.
-
- distributions
-
A dictionary of distributions to use for the composite distribution.
- Type:
-
OrderedDict[str,Distribution]
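Since the reference above is terse, the composite pattern the class implements can be sketched with toy stand-ins: the flat action's size is the sum of the branch sizes, keyed by branch name. The `ToyBranch` and `ToyComposite` names below are hypothetical illustrations, not the schola.sb3 or stable-baselines3 classes:

```python
# Illustrative sketch only: toy stand-ins mirroring the composite
# distribution pattern (NOT the schola.sb3 / stable-baselines3 code).
from collections import OrderedDict


class ToyBranch:
    """A stand-in for one sub-distribution (one action-space branch)."""

    def __init__(self, action_dim):
        self.action_dim = action_dim


class ToyComposite:
    """Treats the flat action as the concatenation of all branch actions."""

    def __init__(self, distributions):
        self.distributions = distributions

    @property
    def action_dim(self):
        # Total flat size = sum of every branch's size.
        return sum(d.action_dim for d in self.distributions.values())

    @property
    def action_dims(self):
        # Per-branch sizes, keyed by branch name.
        return {k: d.action_dim for k, d in self.distributions.items()}


dists = OrderedDict([("discrete", ToyBranch(3)), ("continuous", ToyBranch(2))])
composite = ToyComposite(dists)
print(composite.action_dim)   # 5
print(composite.action_dims)  # {'discrete': 3, 'continuous': 2}
```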
Methods
- __init__(distributions[, …])
- action_generator(action)
Takes an action sampled from this distribution and generates the actions corresponding to each branch of the distribution (e.g. if we have 2 box spaces, it generates a sequence of 2 values sampled from those distributions).
- actions_from_params(action_logits, log_std)
Returns samples from the probability distribution given its parameters.
- entropy()
Returns Shannon’s entropy of the probability distribution.
- get_actions([deterministic])
Return actions according to the probability distribution.
- log_prob(actions)
Get the log probabilities of actions according to the distribution.
- log_prob_from_params(mean_actions, log_std)
Compute the log probability of taking an action given the distribution parameters.
- map_dists(func[, normalize])
Maps a function over the distributions in the composite distribution.
- mode()
Returns the most likely action (deterministic output) from the probability distribution.
- proba_distribution(mean_actions, log_std)
Create the distribution given its parameters (mean, std).
- proba_distribution_net(latent_dim[, …])
Create the layers and parameter that represent the distribution: one output will be the mean of the Gaussian, the other parameter will be the standard deviation (log std in fact to allow negative values).
- sample()
Returns a sample from the probability distribution.
Attributes
- action_dim
The size of the action tensor corresponding to this distribution.
- action_dims
The size of the action tensor corresponding to each branch of the distribution.
- layer_dim
The number of neurons required for this distribution.
- layer_dims
The number of neurons required for each branch of the distribution.
- log_std_dim
The number of neurons required for the log standard deviation.
- log_std_dims
The number of neurons required for the log standard deviation of each branch.
- __init__(distributions, discrete_norm_factor=1.0, continuous_norm_factor=1.0)[source]
-
- Parameters:
-
distributions (OrderedDict)
- property action_dim: int
-
The size of the action tensor corresponding to this distribution.
- Returns:
-
The size of the action tensor corresponding to this distribution.
- Return type:
-
int
- property action_dims: Dict[str, int]
-
The size of the action tensor corresponding to each branch of the distribution.
- action_generator(action)[source]
-
Takes an action sampled from this distribution and generates the actions corresponding to each branch of the distribution (e.g. if we have 2 box spaces, it generates a sequence of 2 values sampled from those distributions).
- Parameters:
-
action (th.Tensor) – The action to generate the sub-actions from.
- Yields:
-
th.Tensor – The sub-action corresponding to a branch of the distribution.
- Return type:
-
Iterable[Tensor]
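The branch-splitting behaviour of action_generator can be sketched with plain Python lists in place of torch tensors, assuming the flat action concatenates the branches in distribution order. The `split_action` helper below is an illustration of the idea, not the library function:

```python
from collections import OrderedDict


def split_action(action, action_dims):
    """Yield each branch's slice of a flat action, in branch order.

    Toy version of the action_generator idea: `action` is a flat
    sequence assumed to concatenate the branches back-to-back.
    """
    offset = 0
    for name, dim in action_dims.items():
        yield name, action[offset:offset + dim]
        offset += dim


dims = OrderedDict([("discrete", 3), ("continuous", 2)])
flat = [0.1, 0.7, 0.2, -0.5, 0.9]  # 3 discrete logits + 2 continuous values
for name, sub in split_action(flat, dims):
    print(name, sub)
# discrete [0.1, 0.7, 0.2]
# continuous [-0.5, 0.9]
```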
- actions_from_params(action_logits, log_std, deterministic=False)[source]
-
Returns samples from the probability distribution given its parameters.
- Parameters:
-
-
action_logits (Tensor)
-
log_std (Tensor)
-
deterministic (bool)
-
- Returns:
-
actions
- Return type:
-
Tensor
- entropy()[source]
-
Returns Shannon’s entropy of the probability distribution.
- Returns:
-
the entropy, or None if no analytical form is known
- Return type:
-
Tensor
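Assuming the branches are independent, so that the composite entropy is the sum of the branch entropies (an assumption about the implementation, not stated by the reference), the calculation can be sketched for one categorical and one diagonal-Gaussian branch:

```python
import math


def gaussian_entropy(log_std):
    """Entropy of one Gaussian dimension: 0.5 * log(2*pi*e) + log_std."""
    return 0.5 * math.log(2 * math.pi * math.e) + log_std


def categorical_entropy(probs):
    """Shannon entropy of a categorical distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Assumed composite behaviour: total entropy = sum of branch entropies.
branch_entropies = [
    categorical_entropy([0.5, 0.25, 0.25]),  # discrete branch
    gaussian_entropy(log_std=0.0),           # one continuous dimension
]
total = sum(branch_entropies)
print(round(total, 4))
```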
- property layer_dim: int
-
The number of neurons required for this distribution.
- Returns:
-
The number of neurons required for this distribution.
- Return type:
-
int
- property layer_dims: Dict[str, int]
-
The number of neurons required for each branch of the distribution.
- log_prob(actions)[source]
-
Get the log probabilities of actions according to the distribution. Note that you must first call the proba_distribution() method.
- Parameters:
-
actions (Tensor) – The actions to evaluate.
- Returns:
-
The log probability of the actions.
- Return type:
-
Tensor
- log_prob_from_params(mean_actions, log_std)[source]
-
Compute the log probability of taking an action given the distribution parameters.
- Parameters:
-
-
mean_actions (Tensor)
-
log_std (Tensor)
-
- Returns:
-
The actions and their associated log probability.
- Return type:
-
Tuple[Tensor, Tensor]
- property log_std_dim: int
-
The number of neurons required for the log standard deviation.
- Returns:
-
The number of neurons required for the log standard deviation.
- Return type:
-
int
- property log_std_dims: Dict[str, int]
-
The number of neurons required for the log standard deviation of each branch.
- map_dists(func, normalize=False)[source]
-
Maps a function over the distributions in the composite distribution.
- Parameters:
-
-
func (Callable[[Distribution], Any]) – The function to map over the distributions.
-
normalize (bool, optional) – Whether to normalize the output of the function using the norm factors, by default False
-
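A sketch of the map_dists idea, assuming (as the normalize parameter suggests) that each branch's result is scaled by the matching discrete or continuous normalization factor. The `ToyDist` class and the factor-selection logic are illustrative assumptions, not the schola.sb3 implementation:

```python
from collections import OrderedDict


class ToyDist:
    """Hypothetical stand-in for a sub-distribution; real branches
    come from stable-baselines3 Distribution subclasses."""

    def __init__(self, kind, value):
        self.kind = kind    # "discrete" or "continuous"
        self.value = value


def map_dists(distributions, func, normalize=False,
              discrete_norm_factor=1.0, continuous_norm_factor=1.0):
    """Apply `func` to each sub-distribution, optionally scaling the
    result by the branch's normalization factor (assumed semantics)."""
    for dist in distributions.values():
        out = func(dist)
        if normalize:
            factor = (discrete_norm_factor if dist.kind == "discrete"
                      else continuous_norm_factor)
            out = out * factor
        yield out


dists = OrderedDict([
    ("d", ToyDist("discrete", 4.0)),
    ("c", ToyDist("continuous", 10.0)),
])
vals = list(map_dists(dists, lambda d: d.value, normalize=True,
                      discrete_norm_factor=0.5, continuous_norm_factor=0.1))
print(vals)  # [2.0, 1.0]
```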
- mode()[source]
-
Returns the most likely action (deterministic output) from the probability distribution
- Returns:
-
the most likely (deterministic) action
- proba_distribution(mean_actions, log_std)[source]
-
Create the distribution given its parameters (mean, std)
- Parameters:
-
-
mean_actions (Tensor)
-
log_std (Tensor)
-
- Returns:
-
The distribution with updated parameters.
- proba_distribution_net(latent_dim, log_std_init=0.0)[source]
-
Create the layers and parameter that represent the distribution: one output will be the mean of the Gaussian, the other parameter will be the standard deviation (log std in fact to allow negative values)
- Parameters:
-
-
latent_dim – Dimension of the last layer of the policy (before the action layer)
-
log_std_init (float) – Initial value for the log standard deviation
-
- Returns:
-
The network that outputs the mean actions and the parameter for the log standard deviation.
- sample()[source]
-
Returns a sample from the probability distribution
- Returns:
-
the stochastic action
- Return type:
-
Tensor