mpcrl.core.exploration.OrnsteinUhlenbeckExploration

class mpcrl.core.exploration.OrnsteinUhlenbeckExploration(mean, sigma, theta=0.15, dt=1.0, initial_noise=None, hook='on_update', mode='gradient-based', seed=None)

Bases: ExplorationStrategy

Exploration based on the Ornstein-Uhlenbeck Brownian motion with friction.

Inspired by stable_baselines3.common.noise.OrnsteinUhlenbeckActionNoise.
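For intuition, the process can be sketched as a single Euler-Maruyama step in plain NumPy. This is a minimal illustration that assumes the usual discretization (as used by the stable_baselines3 noise class mentioned above), not the library's own code. The helper name `ou_step` is hypothetical; `mean`, `sigma`, `theta`, and `dt` mirror the constructor parameters documented below.

```python
import numpy as np


def ou_step(noise, mean, sigma, theta=0.15, dt=1.0, rng=None):
    """One Euler-Maruyama step of an Ornstein-Uhlenbeck process (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = np.asarray(noise, dtype=float)
    # Drift: pulls the current noise back towards the mean ("friction").
    drift = theta * (mean - noise) * dt
    # Diffusion: Gaussian increment scaled by sigma and sqrt(dt).
    diffusion = sigma * np.sqrt(dt) * rng.normal(size=noise.shape)
    return noise + drift + diffusion


rng = np.random.default_rng(0)
noise = np.zeros(2)  # start from zero noise, as with initial_noise=None
trajectory = []
for _ in range(5):
    noise = ou_step(noise, mean=np.zeros(2), sigma=0.1, rng=rng)
    trajectory.append(noise)
trajectory = np.stack(trajectory)  # shape (5, 2): temporally correlated noise
```

Because each sample depends on the previous one, consecutive perturbations are correlated, unlike independent Gaussian draws.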

Parameters:

mean : scheduler or array/supports-algebraic-operations

Mean of the stochastic process. Should have the same shape as the action.

sigma : scheduler or array/supports-algebraic-operations

Standard deviation of the stochastic process. Should have the same shape as the action.

theta : float, optional

Coefficient of attraction of the process towards the mean, by default 0.15.

dt : float, optional

Time step of the process, by default 1.0.

initial_noise : array-like, optional

Initial value of the noise. By default None, in which case it is set to zero.

hook : {"on_update", "on_episode_end", "on_timestep_end"}, optional

Specifies which callback to hook onto, i.e., when to step the exploration's schedulers (if any) in order to, e.g., decay the chance of exploring or the perturbation strength (see also step). The options are

"on_update", which steps the exploration after each agent’s update
"on_episode_end", which steps the exploration after each episode ends
"on_timestep_end", which steps the exploration after each env’s timestep.

By default, "on_update" is selected.

mode : {"gradient-based", "additive"}, optional

Mode of application of explorative perturbations to the MPC. If "additive", the drawn perturbation is added to the optimal action computed by the MPC solver. By default, "gradient-based" is selected; in this mode the perturbation enters the MPC objective directly, multiplied by the first action, and thus affects its gradient.

seed : None, int, array_like of ints, SeedSequence, BitGenerator, Generator

Number to seed the numpy.random.Generator used for randomizing the exploration. By default, None.
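The accepted seed types match those of `numpy.random.default_rng`. The sketch below assumes the seed is forwarded to that function (the page does not show the internals) and only illustrates how seeding affects reproducibility of the drawn noise.

```python
import numpy as np

# Two generators built from the same integer seed produce identical draws.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same = np.array_equal(rng_a.normal(size=3), rng_b.normal(size=3))

# A SeedSequence can spawn child sequences for independent streams,
# e.g., one per agent or per experiment repetition.
children = np.random.SeedSequence(42).spawn(2)
rng_c, rng_d = (np.random.default_rng(child) for child in children)
different = not np.array_equal(rng_c.normal(size=3), rng_d.normal(size=3))

# An existing Generator is returned unaltered by default_rng.
reused = np.random.default_rng(rng_a) is rng_a
```

Passing an integer seed is usually enough for repeatable experiments, while a shared Generator lets several components draw from one stream.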

Methods

`can_explore`()	Computes whether, according to the exploration strategy, the agent should explore at the current instant.
`perturbation`(*_, size, **__)	Returns a random perturbation.
`reset`([seed])	Resets the exploration status, in case it is non-deterministic.
`step`(*_, **__)	Updates (i.e., decays or increases) the mean and standard deviation of the perturbation according to their schedulers.

Attributes

`hook`	Gets which callback the exploration is hooked on, i.e., when to step the exploration's schedulers (if any) to, e.g., decay the chances of exploring or the perturbation strength (see `step` also).
`mode`	Gets the mode of application of explorative perturbations to the MPC.

can_explore()

Computes whether, according to the exploration strategy, the agent should explore at the current instant.

Returns:

bool: True if the agent should explore according to this strategy; otherwise, False.

Return type:

bool

property hook: Literal['on_update', 'on_episode_end', 'on_timestep_end'] | None

Gets the callback the exploration is hooked on, i.e., when the exploration's schedulers (if any) are stepped in order to, e.g., decay the chance of exploring or the perturbation strength (see also step). Can be None if no hook is needed.

property mode: Literal['gradient-based', 'additive']

Gets the mode of application of explorative perturbations to the MPC.
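The difference between the two modes can be illustrated on a toy, unconstrained, single-action problem with objective 0.5 * (u - u_ref)^2. This is a sketch under that assumption only: `solve_toy_mpc` is a hypothetical closed-form stand-in for an MPC solver and is not part of the library.

```python
def solve_toy_mpc(u_ref, perturbation=0.0):
    """Closed-form minimizer of 0.5 * (u - u_ref)**2 + perturbation * u."""
    # Stationarity condition: (u - u_ref) + perturbation = 0.
    return u_ref - perturbation


u_ref, p = 1.0, 0.2

# "additive": solve the unperturbed problem, then add the perturbation.
u_additive = solve_toy_mpc(u_ref) + p

# "gradient-based": the perturbation multiplies the action in the objective,
# so it shifts the objective's gradient and hence the minimizer.
u_gradient = solve_toy_mpc(u_ref, perturbation=p)
```

In this toy case the two actions are roughly 1.2 and 0.8. In a constrained MPC, the gradient-based action would still be the solver's output and therefore subject to the problem's constraints, whereas an additively perturbed action is modified after solving.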

perturbation(*_, size, **__)

Returns a random perturbation.

Return type: ndarray[tuple[Any, ...], dtype[floating]]

reset(seed=None)

Resets the exploration status, in case it is non-deterministic.

Return type: None

step(*_, **__)

Updates (i.e., decays or increases) the mean and standard deviation of the perturbation according to their schedulers.

Return type: None
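To illustrate what stepping a scheduler means, the sketch below decays a standard deviation exponentially. `ToyExponentialScheduler` is a hypothetical stand-in written for this example; it is not the library's scheduler class, and its interface is an assumption.

```python
class ToyExponentialScheduler:
    """Hypothetical scheduler: multiplies its value by `factor` at each step."""

    def __init__(self, init_value, factor):
        self.value = init_value
        self.factor = factor

    def step(self):
        self.value *= self.factor


sigma = ToyExponentialScheduler(init_value=0.5, factor=0.9)
history = []
for _ in range(3):  # e.g., once per agent update when hook="on_update"
    history.append(sigma.value)  # value used for exploration at this stage
    sigma.step()                 # decay the perturbation strength
```

With these numbers the recorded values are approximately 0.5, 0.45, and 0.405, so exploration weakens as learning progresses.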

Examples using `mpcrl.core.exploration.OrnsteinUhlenbeckExploration`

On-policy Deterministic Policy Gradient
