mpcrl.core.callbacks.AgentCallbackMixin
- class mpcrl.core.callbacks.AgentCallbackMixin
Bases: CallbackMixin
Class with callbacks for agents.
In particular, this class defines the following callbacks:
- on_mpc_failure, invoked when an MPC solver fails
- on_validation_start, invoked when validation starts (see mpcrl.Agent.evaluate)
- on_validation_end, invoked when validation ends
- on_episode_start, invoked when a training or validation episode starts
- on_episode_end, invoked when a training or validation episode ends
- on_env_step, invoked when a training or validation episode steps, i.e., after gymnasium.Env.step
- on_timestep_end, invoked when the current simulation's time step reaches an end, i.e., after having stepped the environment and done all the internal computations according to the algorithm.
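The ordering implied by the list above can be illustrated with a minimal, self-contained sketch. It does not import mpcrl: `RecordingCallbacks` and `run_validation` are hypothetical stand-ins written for this example, and the toy loop is only an assumption about how an agent might drive the callbacks during validation, not the library's implementation.

```python
class RecordingCallbacks:
    """Hypothetical stand-in that mirrors the callback names; not mpcrl code."""

    def __init__(self):
        self.calls = []

    def on_validation_start(self, env):
        self.calls.append("on_validation_start")

    def on_episode_start(self, env, episode, state):
        self.calls.append("on_episode_start")

    def on_env_step(self, env, episode, timestep):
        self.calls.append("on_env_step")

    def on_timestep_end(self, env, episode, timestep):
        self.calls.append("on_timestep_end")

    def on_episode_end(self, env, episode, rewards):
        self.calls.append("on_episode_end")

    def on_validation_end(self, env, returns):
        self.calls.append("on_validation_end")


def run_validation(agent, env, episodes, steps):
    """Toy driver (an assumption) that fires the callbacks in the described order."""
    agent.on_validation_start(env)
    returns = []
    for episode in range(episodes):
        agent.on_episode_start(env, episode, state=0.0)
        rewards = 0.0
        for timestep in range(steps):
            rewards += 1.0  # placeholder for the reward returned by env.step
            agent.on_env_step(env, episode, timestep)
            # the algorithm's internal computations would run here
            agent.on_timestep_end(env, episode, timestep)
        agent.on_episode_end(env, episode, rewards)
        returns.append(rewards)
    agent.on_validation_end(env, returns)
    return returns


agent = RecordingCallbacks()
returns = run_validation(agent, env=None, episodes=1, steps=2)
print(agent.calls)
```

In this sketch `env` is passed as `None` because the stand-in never touches it; a real agent would receive the gym environment.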
Methods
- on_env_step(env, episode, timestep): Callback called after each call to gymnasium.Env.step.
- on_episode_end(env, episode, rewards): Callback called at the end of each episode in the training or evaluation process (see mpcrl.Agent.evaluate, mpcrl.LearningAgent.train and mpcrl.LearningAgent.train_offpolicy).
- on_episode_start(env, episode, state): Callback called at the beginning of each episode in the training or validation process (see mpcrl.Agent.evaluate, mpcrl.LearningAgent.train and mpcrl.LearningAgent.train_offpolicy).
- on_mpc_failure(episode, timestep, status, raises): Callback in case of failure of the MPC solver.
- on_timestep_end(env, episode, timestep): Callback called at the end of each time iteration.
- on_validation_end(env, returns): Callback called at the end of the validation process (see mpcrl.Agent.evaluate).
- on_validation_start(env): Callback called at the beginning of the validation process (see mpcrl.Agent.evaluate).

- on_env_step(env, episode, timestep)
Callback called after each call to gymnasium.Env.step.
- Parameters:
- env : gym env
The gym environment on which the agent is being trained.
- episode : int
Number of the training episode.
- timestep : int
Time instant of the current training episode.
- Return type:
- on_episode_end(env, episode, rewards)
Callback called at the end of each episode in the training or evaluation process (see mpcrl.Agent.evaluate, mpcrl.LearningAgent.train and mpcrl.LearningAgent.train_offpolicy).
- Parameters:
- env : gym env
The gym environment on which the agent is being trained.
- episode : int
Number of the training episode.
- rewards : float
Cumulative rewards for this episode.
- Return type:
- on_episode_start(env, episode, state)
Callback called at the beginning of each episode in the training or validation process (see mpcrl.Agent.evaluate, mpcrl.LearningAgent.train and mpcrl.LearningAgent.train_offpolicy).
- Parameters:
- env : gym env
The gym environment on which the agent is being trained.
- episode : int
Number of the training episode.
- state : ObsType
Starting state for this episode.
- Return type:
- on_mpc_failure(episode, timestep, status, raises)
Callback in case of failure of the MPC solver.
- Parameters:
- episode : int
Number of the episode when the failure happened.
- timestep : int or None
Timestep of the current episode when the failure happened. Can be None if the error occurs between episodes or if no notion of time step is available.
- status : str
Status of the solver that failed.
- raises : bool
Whether the failure should be raised as an exception (True) or as a warning (False).
- Return type:
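As an illustration of the `raises` flag, the following self-contained sketch defines a hypothetical handler with the same signature as on_mpc_failure. The class name `FailureTracker`, the choice of `RuntimeError` and `RuntimeWarning`, and the status string `"solver_failed"` are all assumptions made for this example; the library may use different exception types and solver statuses.

```python
import warnings


class FailureTracker:
    """Hypothetical handler sharing the on_mpc_failure signature; not mpcrl code."""

    def __init__(self):
        self.failures = []

    def on_mpc_failure(self, episode, timestep, status, raises):
        # keep a record of every failure, whether or not it is raised
        self.failures.append((episode, timestep, status))
        # timestep may be None, e.g. when the failure occurs between episodes
        if timestep is None:
            where = f"episode {episode}"
        else:
            where = f"episode {episode}, timestep {timestep}"
        message = f"MPC failure at {where}: {status}"
        if raises:
            raise RuntimeError(message)  # assumed exception type
        warnings.warn(message, RuntimeWarning)  # assumed warning category


tracker = FailureTracker()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # "solver_failed" is a placeholder status, not a real solver message
    tracker.on_mpc_failure(episode=0, timestep=None, status="solver_failed", raises=False)
print(len(caught), tracker.failures)
```

Calling the same handler with `raises=True` would raise the `RuntimeError` instead of emitting a warning.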
- on_timestep_end(env, episode, timestep)
Callback called at the end of each time iteration. It is called with the same frequency as on_env_step, but at a different point in the iteration.
- Parameters:
- env : gym env
The gym environment on which the agent is being trained.
- episode : int
Number of the training episode.
- timestep : int
Time instant of the current training episode.
- Return type:
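The difference in timing between on_env_step and on_timestep_end can be sketched as follows. This is a toy loop under stated assumptions: `TimingProbe` and `run_steps` are hypothetical names, and the `update_done` flag merely stands in for the algorithm's internal computations that, per the class description, take place after the environment step and before the time step ends.

```python
class TimingProbe:
    """Hypothetical probe that records whether the internal update has run."""

    def __init__(self):
        self.update_done = False
        self.seen = []

    def on_env_step(self, env, episode, timestep):
        self.seen.append(("on_env_step", timestep, self.update_done))

    def on_timestep_end(self, env, episode, timestep):
        self.seen.append(("on_timestep_end", timestep, self.update_done))


def run_steps(probe, steps):
    """Toy loop (an assumption): both callbacks fire once per time step."""
    for timestep in range(steps):
        probe.update_done = False
        # env.step(action) would be called here
        probe.on_env_step(None, 0, timestep)
        probe.update_done = True  # stands in for the internal computations
        probe.on_timestep_end(None, 0, timestep)


probe = TimingProbe()
run_steps(probe, steps=2)
for record in probe.seen:
    print(record)
```

Both callbacks appear once per step, but only on_timestep_end observes the flag set by the stand-in update.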
- on_validation_end(env, returns)
Callback called at the end of the validation process (see mpcrl.Agent.evaluate).
- Parameters:
- env : gym env
The gym environment on which the agent has been validated.
- returns : array of double
Each episode's cumulative rewards.
- Return type:
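One possible use of on_validation_end is to summarize the per-episode returns. The sketch below is self-contained and hypothetical: `ValidationSummary` is not part of mpcrl, and a plain list is used in place of the array of returns; the handler only assumes that `returns` can be iterated and its entries converted to float.

```python
from statistics import mean, pstdev


class ValidationSummary:
    """Hypothetical handler sharing the on_validation_end signature."""

    def __init__(self):
        self.summary = None

    def on_validation_end(self, env, returns):
        # convert to plain floats so that lists and 1-D arrays are both accepted
        values = [float(r) for r in returns]
        self.summary = {
            "episodes": len(values),
            "mean": mean(values),
            "std": pstdev(values),  # population standard deviation
        }


summary = ValidationSummary()
summary.on_validation_end(env=None, returns=[1.0, 2.0, 3.0])
print(summary.summary)
```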
- on_validation_start(env)
Callback called at the beginning of the validation process (see mpcrl.Agent.evaluate).
- Parameters:
- env : gym env
The gym environment on which the agent is being validated.
- Return type: