====================================================
Reinforcement Learning with Model Predictive Control
====================================================

.. automodule:: mpcrl

|PyPI version| |Source Code License| |Python 3.9|
|Tests| |Docs| |Downloads| |Maintainability| |Code Coverage| |Ruff|

------------------
Short introduction
------------------

This framework, also referred to as *RL with/using MPC*, was first proposed in
:cite:`gros_datadriven_2020` and has since been shown to be effective in various
applications, with different learning algorithms and sounder theory, e.g.,
:cite:`cai_mpcbased_2021,esfahani_approximate_2021,zanon_safe_2021,gros_learning_2022`.
It merges two powerful control techniques into a single data-driven one:

- MPC, a well-known control methodology that exploits a prediction model to
  predict the future behaviour of the environment and compute the optimal action
- RL, a Machine Learning paradigm that has shown many successes in recent years
  (with games such as chess, Go, etc.) and is highly adaptable to unknown and
  complex-to-model environments.

The figure below shows the main idea behind this learning-based control approach.
The MPC controller, parametrized in its objective, predictive model and
constraints (or a subset of these), acts both as policy provider (i.e., it
provides an action to the environment, given the current state) and as function
approximator for the state and action value functions (i.e., it predicts the
expected return obtained by following the current control policy from the given
state and state-action pair, respectively). Concurrently, an RL algorithm is
employed to tune this parametrization of the MPC so as to increase the
controller's performance and achieve a (sub)optimal policy. For this purpose,
different algorithms can be employed, two of the most successful being Q-learning
:cite:`esfahani_approximate_2021` and Deterministic Policy Gradient (DPG)
:cite:`cai_mpcbased_2021`.

.. figure:: _static/mpcrl.diagram.light.png
   :alt: Main idea behind the MPC-RL framework
   :align: center
   :width: 50%
   :class: only-light

   Diagram of the MPC-RL framework.

.. figure:: _static/mpcrl.diagram.dark.png
   :alt: Main idea behind the MPC-RL framework
   :align: center
   :width: 50%
   :class: only-dark

   Diagram of the MPC-RL framework.

------
Author
------

Filippo Airaldi, PhD Candidate [f.airaldi@tudelft.nl | filippoairaldi@gmail.com]
at Delft Center for Systems and Control in Delft University of Technology.

Copyright (c) 2025 Filippo Airaldi.

Copyright notice: Technische Universiteit Delft hereby disclaims all copyright
interest in the program “mpcrl” (Reinforcement Learning with Model Predictive
Control) written by the Author(s). Prof. Dr. Ir. Fred van Keulen, Dean of ME.

.. toctree::
   :hidden:

   User guide
   Examples
   Module reference
   Changelog

------------------
Indices and tables
------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

----------
References
----------

.. bibliography:: references.bib
   :style: plain

.. |PyPI version| image:: https://badge.fury.io/py/mpcrl.svg
   :target: https://badge.fury.io/py/mpcrl
   :alt: PyPI version
.. |Source Code License| image:: https://img.shields.io/badge/license-MIT-blueviolet
   :target: https://github.com/FilippoAiraldi/mpc-reinforcement-learning/blob/main/LICENSE
   :alt: MIT License
.. |Python 3.9| image:: https://img.shields.io/badge/python-%3E=3.9-green.svg
   :alt: Python 3.9
.. |Tests| image:: https://github.com/FilippoAiraldi/mpc-reinforcement-learning/actions/workflows/tests.yml/badge.svg
   :target: https://github.com/FilippoAiraldi/mpc-reinforcement-learning/actions/workflows/tests.yml
   :alt: Tests
.. |Docs| image:: https://readthedocs.org/projects/mpc-reinforcement-learning/badge/?version=stable
   :target: https://mpc-reinforcement-learning.readthedocs.io/en/stable/?badge=stable
   :alt: Docs
.. |Downloads| image:: https://static.pepy.tech/badge/mpcrl
   :target: https://www.pepy.tech/projects/mpcrl
   :alt: Downloads
.. |Maintainability| image:: https://qlty.sh/gh/FilippoAiraldi/projects/mpc-reinforcement-learning/maintainability.svg
   :target: https://qlty.sh/gh/FilippoAiraldi/projects/mpc-reinforcement-learning
   :alt: Maintainability
.. |Code Coverage| image:: https://qlty.sh/gh/FilippoAiraldi/projects/mpc-reinforcement-learning/coverage.svg
   :target: https://qlty.sh/gh/FilippoAiraldi/projects/mpc-reinforcement-learning
   :alt: Test Coverage
.. |Ruff| image:: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json
   :target: https://docs.astral.sh/ruff
   :alt: ruff