.. _sphx_glr_auto_examples_gradient-based-offpolicy:

Gradient-based off-policy learning agents
-----------------------------------------

The following example shows how to use gradient-based reinforcement learning (RL)
techniques, in particular Q-learning, to train a Model Predictive Control (MPC)
scheme on a simple task in an off-policy way.

.. raw:: html

    <div class="sphx-glr-thumbnails">

.. thumbnail-parent-div-open

.. raw:: html

    <div class="sphx-glr-thumbcontainer">

.. only:: html

  .. image:: /auto_examples/gradient-based-offpolicy/images/thumb/sphx_glr_q_learning_offpolicy_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_gradient-based-offpolicy_q_learning_offpolicy.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Off-policy Q-learning</div>
    </div>

.. thumbnail-parent-div-close

.. raw:: html

    </div>


.. toctree::
   :hidden:

   /auto_examples/gradient-based-offpolicy/q_learning_offpolicy