Optimal Energy Shaping via Neural Approximators
Authors
Stefano Massaroli, Michael Poli, Federico Califano, Jinkyoo Park, Atsushi Yamashita, Hajime Asama
Abstract
We introduce optimal energy shaping as an enhancement of classical passivity-based control methods. Alongside stability, a long-claimed advantage of passivity theory has been intuitive performance tuning during the execution of a given task. However, a systematic approach to adjusting performance within a passive control framework has yet to be developed, as each existing method relies on a handful of problem-specific practical insights. Here, we cast the classic energy-shaping control design process in an optimal control framework; once a task-dependent performance metric is defined, an optimal solution is obtained systematically through an iterative procedure relying on neural networks and gradient-based optimization. The proposed method is validated on state-regulation tasks.
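The iterative procedure described in the abstract can be illustrated with a minimal sketch: shape the closed-loop energy of a pendulum with a parametrized potential plus damping injection, roll out the closed loop, and tune the controller parameters by gradient descent on a regulation cost. This is a toy approximation, not the paper's method: the gains `k` and `d` and the finite-difference gradient are illustrative stand-ins for the neural-network potential and the backpropagation through the ODE solve used in the paper.

```python
import math

# Plant: pendulum in port-Hamiltonian form, q' = p/m, p' = -m g l sin(q) + u.
m, g, l = 1.0, 9.81, 1.0
q_star = math.pi / 4  # regulation target

def rollout_cost(k, d, q0=0.0, p0=0.0, dt=0.01, steps=600):
    """Explicit-Euler rollout of the closed loop; returns the integrated regulation cost."""
    q, p, cost = q0, p0, 0.0
    for _ in range(steps):
        # Energy shaping + damping injection: cancel gravity, impose the
        # quadratic potential k/2 (q - q*)^2, inject damping through d.
        u = m * g * l * math.sin(q) - k * (q - q_star) - d * p
        dq = p / m
        dp = -m * g * l * math.sin(q) + u
        q, p = q + dt * dq, p + dt * dp
        cost += dt * (q - q_star) ** 2
    return cost

# Gradient-based tuning of the controller parameters; central finite
# differences replace the adjoint/backprop gradient for simplicity.
k, d, lr, eps = 5.0, 0.2, 1.0, 1e-4
c0 = rollout_cost(k, d)
for _ in range(40):
    gk = (rollout_cost(k + eps, d) - rollout_cost(k - eps, d)) / (2 * eps)
    gd = (rollout_cost(k, d + eps) - rollout_cost(k, d - eps)) / (2 * eps)
    k, d = k - lr * gk, d - lr * gd
c1 = rollout_cost(k, d)
```

Under this setup, gradient descent drives the damping gain toward critical damping and the regulation cost decreases across iterations, mirroring (in miniature) the task-metric-driven tuning loop the paper formalizes.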
Citation
- Journal: SIAM Journal on Applied Dynamical Systems
- Year: 2022
- Volume: 21
- Issue: 3
- Pages: 2126–2147
- Publisher: Society for Industrial and Applied Mathematics (SIAM)
- DOI: 10.1137/21m1414279
BibTeX
@article{Massaroli_2022,
title={{Optimal Energy Shaping via Neural Approximators}},
volume={21},
ISSN={1536-0040},
DOI={10.1137/21m1414279},
number={3},
journal={SIAM Journal on Applied Dynamical Systems},
publisher={Society for Industrial and Applied Mathematics (SIAM)},
author={Massaroli, Stefano and Poli, Michael and Califano, Federico and Park, Jinkyoo and Yamashita, Atsushi and Asama, Hajime},
year={2022},
pages={2126--2147}
}
References
- Abadi M., 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (2016)
- Arimoto S., Proceedings of Robotics Research, 1st International Symposium on Robotics Research (1984)
- Arulkumaran, K., Deisenroth, M. P., Brundage, M. & Bharath, A. A. Deep Reinforcement Learning: A Brief Survey. IEEE Signal Processing Magazine vol. 34 26–38 (2017) – 10.1109/msp.2017.2743240
- Baydin A. G., J. Mach. Learn. Res. (2017)
- Berkenkamp F., Advances in Neural Information Processing Systems (2017)
- Bertsekas D. P., Dynamic Programming and Optimal Control (1995)
- Bradbury J., JAX: Composable Transformations of Python+NumPy Programs (2018)
- Buşoniu, L., de Bruin, T., Tolić, D., Kober, J. & Palunko, I. Reinforcement learning for control: Performance, stability, and deep approximators. Annual Reviews in Control vol. 46 8–28 (2018) – 10.1016/j.arcontrol.2018.09.005
- Byrnes, C. I., Isidori, A. & Willems, J. C. Passivity, feedback equivalence, and the global stabilization of minimum phase nonlinear systems. IEEE Transactions on Automatic Control vol. 36 1228–1240 (1991) – 10.1109/9.100932
- Chen R. T., Advances in Neural Information Processing Systems (2018)
- Cranmer M., Lagrangian Neural Networks, preprint, https://arxiv.org/abs/2003.04630 (2020)
- Dimeas, F. & Aspragathos, N. Online Stability in Human-Robot Cooperation with Admittance Control. IEEE Transactions on Haptics vol. 9 267–278 (2016) – 10.1109/toh.2016.2518670
- Dormand, J. R. & Prince, P. J. A family of embedded Runge-Kutta formulae. Journal of Computational and Applied Mathematics vol. 6 19–26 (1980) – 10.1016/0771-050x(80)90013-3
- Duindam, V., Macchelli, A., Stramigioli, S. & Bruyninckx, H. Modeling and Control of Complex Physical Systems. (Springer Berlin Heidelberg, 2009) – 10.1007/978-3-642-03196-0
- Dulac-Arnold G., Challenges of Real-World Reinforcement Learning, preprint, https://arxiv.org/abs/1904.12901 (2019)
- Glorot X., Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (2010)
- Grathwohl W., FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models, preprint, https://arxiv.org/abs/1810.01367 (2018)
- Greydanus S., Advances in Neural Information Processing Systems (2019)
- Groothuis S. S., Front. Robotics AI (2018)
- Innes M., Fashionable Modelling with Flux, preprint, https://arxiv.org/abs/1811.01457 (2018)
- Kidger P., Neural Controlled Differential Equations for Irregular Time Series, preprint, https://arxiv.org/abs/2005.08926 (2020)
- Kingma D. P., Adam: A Method for Stochastic Optimization, preprint, https://arxiv.org/abs/1412.6980 (2014)
- Kölsch L., Optimal Control of Port-Hamiltonian Systems: A Time-Continuous Learning Approach, preprint, https://arxiv.org/abs/2007.08645 (2020)
- Lutter M., Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning, preprint, https://arxiv.org/abs/1907.04490 (2019)
- Massaroli S., 1st IFAC Workshop on Robot Control (WROCO) (2019)
- Massaroli S., Stable Neural Flows, preprint, https://arxiv.org/abs/2003.08063 (2020)
- Massaroli, S. et al. Port–Hamiltonian Approach to Neural Network Training. 2019 IEEE 58th Conference on Decision and Control (CDC) 6799–6806 (2019) – 10.1109/cdc40024.2019.9030017
- Massaroli S., Dissecting Neural ODEs, preprint, https://arxiv.org/abs/2002.08071 (2020)
- Nageshrao, S. P., Lopes, G. A. D., Jeltsema, D. & Babuska, R. Port-Hamiltonian Systems in Adaptive and Learning Control: A Survey. IEEE Transactions on Automatic Control vol. 61 1223–1238 (2016) – 10.1109/tac.2015.2458491
- Ortega, R., van der Schaft, A., Castanos, F. & Astolfi, A. Control by Interconnection and Standard Passivity-Based Control of Port-Hamiltonian Systems. IEEE Transactions on Automatic Control vol. 53 2527–2542 (2008) – 10.1109/tac.2008.2006930
- Ortega, R., van der Schaft, A. J., Mareels, I. & Maschke, B. Putting energy back in control. IEEE Control Systems vol. 21 18–33 (2001) – 10.1109/37.915398
- Paszke A., Advances in Neural Information Processing Systems (2019)
- Poli M., TorchDyn: A Neural Differential Equations Library, preprint, https://arxiv.org/abs/2009.09346 (2020)
- Pontryagin L. S., The Mathematical Theory of Optimal Processes, translated from the Russian by K. N. Trirogoff, edited by L. W. Neustadt (1962)
- Robbins, H. & Monro, S. A Stochastic Approximation Method. The Annals of Mathematical Statistics vol. 22 400–407 (1951) – 10.1214/aoms/1177729586
- Secchi C., STAR 29 (2007)
- Smyrlis, G. & Zisis, V. Local convergence of the steepest descent method in Hilbert spaces. Journal of Mathematical Analysis and Applications vol. 300 436–453 (2004) – 10.1016/j.jmaa.2004.06.051
- Sprangers, O., Babuska, R., Nageshrao, S. P. & Lopes, G. A. D. Reinforcement Learning for Port-Hamiltonian Systems. IEEE Transactions on Cybernetics vol. 45 1017–1027 (2015) – 10.1109/tcyb.2014.2343194
- Stramigioli S., Modeling and IPC Control of Interactive Mechanical Systems—A Coordinate-Free Approach (2001)
- Stramigioli S., Mathematical Control Theory I
- Sutton R. S., Reinforcement Learning: An Introduction (2018)
- van der Schaft A. J., L2-Gain and Passivity Techniques in Nonlinear Control (2000)
- Vos, E., Scherpen, J. M. A., Schaft, A. J. van der & Postma, A. Formation Control of Wheeled Robots in the Port-Hamiltonian Framework. IFAC Proceedings Volumes vol. 47 6662–6667 (2014) – 10.3182/20140824-6-za-1003.00394
- Zhong Y. D., Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control, preprint, https://arxiv.org/abs/1909.12077 (2019)
- Zhong Y. D., Benchmarking Energy-Conserving Neural Networks for Learning Dynamics from Data, preprint, https://arxiv.org/abs/2012.02334 (2020)
- Zhong Y. D., Dissipative SymODEN: Encoding Hamiltonian Dynamics with Dissipation and Control into Deep Learning, preprint, https://arxiv.org/abs/2002.08860 (2020)
- Zhong Y. D., Advances in Neural Information Processing Systems (2020)
- Zhuang J., Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE, preprint, https://arxiv.org/abs/2006.02493 (2020)