Learning feedback Nash strategies for nonlinear port-Hamiltonian systems
Authors
Lukas Kölsch, Pol Jané Soneira, Albertus Johannes Malan, Sören Hohmann
Abstract
This paper presents an adaptive control strategy for solving multi-player noncooperative differential games with dynamics modelled as general nonlinear input-state-output port-Hamiltonian systems. The proposed controller is obtained by extending an existing single-player feedback Nash strategy to N players and by using the Hamiltonian of the port-Hamiltonian system as an admissible control-Lyapunov function for each player. Necessary and sufficient conditions for the stability of the resulting controlled system are provided by employing Lyapunov stability theory. Furthermore, the N-player feedback strategy is extended by adaptively weighting the individual value functions to ensure convergence to the Nash solution. Finally, numerical simulations demonstrate the effectiveness of the proposed explicit control laws.
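The abstract does not reproduce the model equations. As a rough sketch only, an input-state-output port-Hamiltonian system with N players is commonly written in the textbook form below. The symbols (state x, Hamiltonian H, interconnection matrix J, dissipation matrix R, input matrices G_i, inputs u_i, outputs y_i) are assumed notation for illustration and are not taken from the paper.

```latex
% Assumed standard notation, not the paper's own equations.
\begin{align}
  \dot{x} &= \bigl[J(x) - R(x)\bigr]\,\nabla H(x) + \sum_{i=1}^{N} G_i(x)\,u_i, \\
  y_i     &= G_i(x)^{\top}\,\nabla H(x), \qquad i = 1,\dots,N,
\end{align}
% Structural properties: J is skew-symmetric, R is symmetric positive semidefinite.
\begin{equation}
  J(x) = -J(x)^{\top}, \qquad R(x) = R(x)^{\top} \succeq 0 .
\end{equation}
% Power balance: the J-term vanishes by skew-symmetry.
\begin{equation}
  \dot{H} = -\nabla H(x)^{\top} R(x)\,\nabla H(x) + \sum_{i=1}^{N} y_i^{\top} u_i
          \;\le\; \sum_{i=1}^{N} y_i^{\top} u_i .
\end{equation}
```

Under these assumptions, and provided H is positive definite around the equilibrium, the power balance suggests why the Hamiltonian is a natural candidate for the control-Lyapunov function mentioned in the abstract: with zero inputs, H cannot increase along trajectories.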
Citation
- Journal: International Journal of Control
- Year: 2023
- Volume: 96
- Issue: 1
- Pages: 201–213
- Publisher: Informa UK Limited
- DOI: 10.1080/00207179.2021.1986233
BibTeX
@article{K_lsch_2021,
title={{Learning feedback Nash strategies for nonlinear port-Hamiltonian systems}},
volume={96},
ISSN={1366-5820},
DOI={10.1080/00207179.2021.1986233},
number={1},
journal={International Journal of Control},
publisher={Informa UK Limited},
author={Kölsch, Lukas and Jané Soneira, Pol and Malan, Albertus Johannes and Hohmann, Sören},
year={2023},
pages={201--213}
}