.. _sphx_glr_auto_examples_gradient-based-offpolicy:

Gradient-based off-policy learning agents
-----------------------------------------

The following example shows how to use gradient-based Reinforcement Learning
techniques (in particular, Q-learning) to train a Model Predictive Control
(MPC) scheme on a simple task in an off-policy way.

.. raw:: html
.. thumbnail-parent-div-open

.. raw:: html
.. only:: html

  .. image:: /auto_examples/gradient-based-offpolicy/images/thumb/sphx_glr_q_learning_offpolicy_thumb.png
    :alt:

  :ref:`sphx_glr_auto_examples_gradient-based-offpolicy_q_learning_offpolicy.py`

.. raw:: html
Off-policy Q-learning
.. thumbnail-parent-div-close

.. raw:: html
.. toctree::
   :hidden:

   /auto_examples/gradient-based-offpolicy/q_learning_offpolicy