Control by interconnection of a manipulator arm using reinforcement learning

Authors

S. P. Nageshrao, G. A. D. Lopes, D. Jeltsema, R. Babuska

Abstract

Control by interconnection (CbI) is a dynamic output-feedback approach used to control port-Hamiltonian (PH) systems. Here, both the plant and the controller are modelled in PH form, each in terms of its own Hamiltonian. However, obtaining an appropriate controller Hamiltonian is generally difficult. In this paper, we address this issue by using reinforcement learning (RL). Additionally, because RL algorithms perform semi-supervised optimization, a performance criterion can be readily included in CbI. We demonstrate the usefulness of the proposed learning algorithm for stabilization of a manipulator arm.
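The interconnection structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the plant here is an assumed mass-spring-damper rather than a manipulator arm, the controller Hamiltonian is a hand-picked quadratic rather than one learned by RL, and every name and parameter value (M, K, B, KC, closed_loop, simulate) is hypothetical. It only shows the standard power-preserving interconnection u = -y_c, u_c = y between a PH plant and a PH controller, under which the total energy H + H_c cannot increase.

```python
# Illustrative only: the plant, the parameter values, and all names below are
# assumptions made for this sketch; none are taken from the paper.
M, K, B = 1.0, 2.0, 0.5   # plant mass, spring stiffness, damping coefficient
KC = 3.0                  # stiffness of the hand-picked controller Hamiltonian


def plant_energy(q, p):
    """Plant Hamiltonian H(q, p): kinetic plus potential energy."""
    return p * p / (2.0 * M) + 0.5 * K * q * q


def controller_energy(xi):
    """Controller Hamiltonian H_c(xi); in the paper this is what RL would learn."""
    return 0.5 * KC * xi * xi


def closed_loop(state):
    """Vector field of plant and controller under u = -y_c, u_c = y."""
    q, p, xi = state
    y = p / M             # plant output g^T dH/dx (the velocity)
    y_c = KC * xi         # controller output dH_c/dxi
    u = -y_c              # power-preserving interconnection
    u_c = y
    dq = p / M
    dp = -K * q - B * (p / M) + u
    dxi = u_c
    return (dq, dp, dxi)


def rk4_step(state, dt):
    """One classical Runge-Kutta step of the closed-loop dynamics."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = closed_loop(state)
    k2 = closed_loop(shifted(state, k1, dt / 2.0))
    k3 = closed_loop(shifted(state, k2, dt / 2.0))
    k4 = closed_loop(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, dt=0.001, steps=40000):
    """Integrate the closed loop; return the final state and the energy history."""
    energies = []
    for _ in range(steps):
        q, p, xi = state
        energies.append(plant_energy(q, p) + controller_energy(xi))
        state = rk4_step(state, dt)
    return state, energies


if __name__ == "__main__":
    final_state, energies = simulate((1.0, 0.0, 1.0))
    print("final state:", final_state)
    print("initial and final total energy:", energies[0], energies[-1])
```

In this sketch the difference xi - q stays constant along trajectories, so the controller state is tied to the plant position and the controller Hamiltonian effectively reshapes the plant's potential energy. Choosing that Hamiltonian by hand is the step the abstract calls difficult and proposes to handle with RL.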

Citation

Journal: 2015 IEEE International Symposium on Intelligent Control (ISIC)
Year: 2015
Pages: 47–52
Publisher: IEEE
DOI: 10.1109/isic.2015.7307278

BibTeX

@inproceedings{Nageshrao_2015,
  title={{Control by interconnection of a manipulator arm using reinforcement learning}},
  DOI={10.1109/isic.2015.7307278},
  booktitle={{2015 IEEE International Symposium on Intelligent Control (ISIC)}},
  publisher={IEEE},
  author={Nageshrao, S. P. and Lopes, G. A. D. and Jeltsema, D. and Babuska, R.},
  year={2015},
  pages={47--52}
}

References

Busoniu, L., Babuska, R., De Schutter, B. & Ernst, D. Reinforcement Learning and Dynamic Programming Using Function Approximators. (2017) – 10.1201/9781439821091
Vrabie, D., Vamvoudakis, K. G. & Lewis, F. L. Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. (Institution of Engineering and Technology, 2012) – 10.1049/pbce081e
Bertsekas, D. P. Dynamic Programming and Optimal Control, 3rd ed. (2011)
Nageshrao, S. P., Lopes, G. A. D., Jeltsema, D. & Babuška, R. Passivity-based reinforcement learning control of a 2-DOF manipulator arm. Mechatronics 24, 1001–1007 (2014) – 10.1016/j.mechatronics.2014.10.005
Benosman, M. & Atınç, G. M. Extremum seeking-based adaptive control for electromagnetic actuators. International Journal of Control 88, 517–530 (2014) – 10.1080/00207179.2014.964779
Beard, R. W., Saridis, G. N. & Wen, J. T. Approximate Solutions to the Time-Invariant Hamilton–Jacobi–Bellman Equation. Journal of Optimization Theory and Applications 96, 589–626 (1998) – 10.1023/a:1022664528457
Vrabie, D. & Lewis, F. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Networks 22, 237–246 (2009) – 10.1016/j.neunet.2009.03.008
Willems, J. C. Dissipative dynamical systems part I: General theory. Arch. Rational Mech. Anal. 45, 321–351 (1972) – 10.1007/bf00276493
Marsden, J. E. & Ratiu, T. S. Introduction to Mechanics and Symmetry. Texts in Applied Mathematics (Springer New York, 1999) – 10.1007/978-0-387-21792-5
Grondman, I., Busoniu, L., Lopes, G. A. D. & Babuska, R. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients. IEEE Trans. Syst., Man, Cybern. C 42, 1291–1307 (2012) – 10.1109/tsmcc.2012.2218595
Ortega, R., van der Schaft, A., Castanos, F. & Astolfi, A. Control by Interconnection and Standard Passivity-Based Control of Port-Hamiltonian Systems. IEEE Trans. Automat. Contr. 53, 2527–2542 (2008) – 10.1109/tac.2008.2006930
Ortega, R., Loría, A., Nicklasson, P. J. & Sira-Ramírez, H. Passivity-Based Control of Euler-Lagrange Systems. Communications and Control Engineering (Springer London, 1998) – 10.1007/978-1-4471-3603-3
Ortega, R., van der Schaft, A. J., Mareels, I. & Maschke, B. Putting energy back in control. IEEE Control Syst. 21, 18–33 (2001) – 10.1109/37.915398
Koopman, Casimir-based control beyond the dissipation obstacle. IFAC Workshop on Lagrangian and Hamiltonian Methods in Nonlinear Control (2012)
van der Schaft, A. & Jeltsema, D. Port-Hamiltonian Systems Theory: An Introductory Overview. (2014) – 10.1561/9781601987877
Ortega, R., van der Schaft, A., Maschke, B. & Escobar, G. Interconnection and damping assignment passivity-based control of port-controlled Hamiltonian systems. Automatica 38, 585–596 (2002) – 10.1016/s0005-1098(01)00278-3
Duindam, V., Macchelli, A., Stramigioli, S. & Bruyninckx, H. Modeling and Control of Complex Physical Systems. (Springer Berlin Heidelberg, 2009) – 10.1007/978-3-642-03196-0
van der Schaft, A. L2-Gain and Passivity Techniques in Nonlinear Control. Communications and Control Engineering (Springer London, 2000) – 10.1007/978-1-4471-0507-7
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. (1998)
Grondman, I. Online model learning algorithms for actor-critic control. (2015)
Bhatnagar, S., Sutton, R. S., Ghavamzadeh, M. & Lee, M. Natural actor–critic algorithms. Automatica 45, 2471–2482 (2009) – 10.1016/j.automatica.2009.07.008