Control by interconnection of a manipulator arm using reinforcement learning
Authors
S. P. Nageshrao, G. A. D. Lopes, D. Jeltsema, R. Babuska
Abstract
Control by interconnection (CbI) is a dynamic output-feedback approach used to control port-Hamiltonian (PH) systems. Here, both the plant and the controller are modelled in PH form, in terms of their own Hamiltonians. However, obtaining an appropriate controller Hamiltonian is generally difficult. In this paper, we address this issue by using reinforcement learning (RL). Additionally, due to the semi-supervised optimization nature of RL algorithms, a performance criterion can be readily included in CbI. We demonstrate the usefulness of the proposed learning algorithm for stabilization of a manipulator arm.
Citation
- Journal: 2015 IEEE International Symposium on Intelligent Control (ISIC)
- Year: 2015
- Pages: 47–52
- Publisher: IEEE
- DOI: 10.1109/isic.2015.7307278
BibTeX
@inproceedings{Nageshrao_2015,
title={{Control by interconnection of a manipulator arm using reinforcement learning}},
DOI={10.1109/isic.2015.7307278},
booktitle={{2015 IEEE International Symposium on Intelligent Control (ISIC)}},
publisher={IEEE},
author={Nageshrao, S. P. and Lopes, G. A. D. and Jeltsema, D. and Babuska, R.},
year={2015},
pages={47--52}
}
References
- Busoniu, L., Babuska, R., De Schutter, B. & Ernst, D. Reinforcement Learning and Dynamic Programming Using Function Approximators. (2017) – 10.1201/9781439821091
- Vrabie, D., Vamvoudakis, K. G. & Lewis, F. L. Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. (Institution of Engineering and Technology, 2012) – 10.1049/pbce081e
- Bertsekas, Dynamic Programming and Optimal Control, 3rd ed. (2011)
- Nageshrao, S. P., Lopes, G. A. D., Jeltsema, D. & Babuška, R. Passivity-based reinforcement learning control of a 2-DOF manipulator arm. Mechatronics 24, 1001–1007 (2014) – 10.1016/j.mechatronics.2014.10.005
- Benosman, M. & Atınç, G. M. Extremum seeking-based adaptive control for electromagnetic actuators. International Journal of Control 88, 517–530 (2014) – 10.1080/00207179.2014.964779
- Beard, R. W., Saridis, G. N. & Wen, J. T. Approximate Solutions to the Time-Invariant Hamilton–Jacobi–Bellman Equation. Journal of Optimization Theory and Applications 96, 589–626 (1998) – 10.1023/a:1022664528457
- Vrabie, D. & Lewis, F. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Networks 22, 237–246 (2009) – 10.1016/j.neunet.2009.03.008
- Willems, J. C. Dissipative dynamical systems part I: General theory. Arch. Rational Mech. Anal. 45, 321–351 (1972) – 10.1007/bf00276493
- Marsden, J. E. & Ratiu, T. S. Introduction to Mechanics and Symmetry. Texts in Applied Mathematics (Springer New York, 1999) – 10.1007/978-0-387-21792-5
- Grondman, I., Busoniu, L., Lopes, G. A. D. & Babuska, R. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients. IEEE Trans. Syst., Man, Cybern. C 42, 1291–1307 (2012) – 10.1109/tsmcc.2012.2218595
- Ortega, R., van der Schaft, A., Castanos, F. & Astolfi, A. Control by Interconnection and Standard Passivity-Based Control of Port-Hamiltonian Systems. IEEE Trans. Automat. Contr. 53, 2527–2542 (2008) – 10.1109/tac.2008.2006930
- Ortega, R., Loría, A., Nicklasson, P. J. & Sira-Ramírez, H. Passivity-Based Control of Euler-Lagrange Systems. Communications and Control Engineering (Springer London, 1998) – 10.1007/978-1-4471-3603-3
- Ortega, R., van der Schaft, A. J., Mareels, I. & Maschke, B. Putting energy back in control. IEEE Control Syst. 21, 18–33 (2001) – 10.1109/37.915398
- Koopman, Casimir-based control beyond the dissipation obstacle. IFAC Workshop on Lagrangian and Hamiltonian Methods in Nonlinear Control (2012)
- van der Schaft, A. & Jeltsema, D. Port-Hamiltonian Systems Theory: An Introductory Overview. (2014) – 10.1561/9781601987877
- Ortega, R., van der Schaft, A., Maschke, B. & Escobar, G. Interconnection and damping assignment passivity-based control of port-controlled Hamiltonian systems. Automatica 38, 585–596 (2002) – 10.1016/s0005-1098(01)00278-3
- Duindam, V., Macchelli, A., Stramigioli, S. & Bruyninckx, H. Modeling and Control of Complex Physical Systems. (Springer Berlin Heidelberg, 2009) – 10.1007/978-3-642-03196-0
- van der Schaft, A. L2-Gain and Passivity Techniques in Nonlinear Control. Communications and Control Engineering (Springer London, 2000) – 10.1007/978-1-4471-0507-7
- Sutton, Reinforcement Learning: An Introduction (1998)
- Grondman, Online model learning algorithms for actor-critic control. (2015)
- Bhatnagar, S., Sutton, R. S., Ghavamzadeh, M. & Lee, M. Natural actor–critic algorithms. Automatica 45, 2471–2482 (2009) – 10.1016/j.automatica.2009.07.008