Robotic Knee Tracking Control to Mimic the Intact Human Knee Profile Based on Actor-Critic Reinforcement Learning

Ruofan Wu; Zhikai Yao; Jennie Si; He (Helen) Huang

doi:10.1109/JAS.2021.1004272

Volume 9 Issue 1

Jan. 2022

IEEE/CAA Journal of Automatica Sinica



R. F. Wu, Z. K. Yao, J. Si, and H. Huang, “Robotic knee tracking control to mimic the intact human knee profile based on actor-critic reinforcement learning,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 1, pp. 19–30, Jan. 2022. doi: 10.1109/JAS.2021.1004272


Ruofan Wu^1, Zhikai Yao^1, Jennie Si^1, He (Helen) Huang^{2,3}

1. School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85287 USA
2. Department of Biomedical Engineering, North Carolina State University, Raleigh, NC 27695 USA
3. University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA

Funds: This work was partly supported by the National Science Foundation (1563921, 1808752, 1563454, 1808898).


Author Bio:
Ruofan Wu received the B.S. degree in telecommunications engineering from Beijing University of Posts and Telecommunications, China, in 2014, and the M.S. degree in electrical engineering from Arizona State University, USA, in 2017. He is currently a Ph.D. candidate at the School of Electrical, Computer and Energy Engineering, Arizona State University, USA. His research interests include adaptive control, reinforcement learning, and robotics control.

Zhikai Yao received the B.Tech. degree in 2014 from the Qingdao University of Technology, Qingdao, China, and the Ph.D. degree in 2021 from Nanjing University of Science and Technology, Nanjing, China. He has been an exchange student with the School of Electrical, Computer and Energy Engineering, Arizona State University, USA, since 2019. His research interests include intelligent control of mechanical systems, reinforcement learning, and human-robot interaction.

Jennie Si (Fellow, IEEE) received the B.S. degree in 1985, the M.S. degree in 1988 from Tsinghua University, China, and the Ph.D. degree in 1992 from the University of Notre Dame, USA. She has been a Faculty Member with the School of Electrical, Computer and Energy Engineering, Arizona State University, USA, since 1991. Her research focuses on reinforcement learning control which utilizes machine learning and neural network approximations. She is also interested in fundamental neuroscience of the frontal cortex and its role in decision and control processes. Dr. Si was a Recipient of the NSF/White House Presidential Faculty Fellow Award in 1995 and the Motorola Engineering Excellence Award in 1995. She is a Distinguished Lecturer of the IEEE Computational Intelligence Society. She consulted for Intel, Arizona Public Service, and Medtronic. She has served on several professional organizations’ executive boards and international conference committees. She was an Advisor to the NSF Social Behavioral and Economical Directory. She served on several proposal review panels. She was an Associate Editor for the IEEE Transactions on Semiconductor Manufacturing, the IEEE Transactions on Automatic Control, and an Action Editor for Neural Networks. She is currently an Associate Editor for the IEEE Transactions on Neural Networks and Learning Systems.

He (Helen) Huang (Senior Member, IEEE) received the B.S. degree in electronic and information engineering from Xi’an Jiaotong University in 2000, and the M.S. and Ph.D. degrees in biomedical engineering from Arizona State University, USA, in 2002 and 2006, respectively. She is currently the Jackson Family Distinguished Professor with the joint Department of Biomedical Engineering, North Carolina State University, USA, and the University of North Carolina at Chapel Hill, USA, and the Director for the Closed-Loop Engineering for Advanced Rehabilitation (CLEAR) core. Her research interest lies in neural-machine interfaces for prostheses and exoskeletons, human-robot interaction, adaptive and optimal control of wearable robots, and human movement control. Dr. Huang was the Recipient of the Delsys Prize for Innovation in Electromyography, the Mary E. Switzer Fellowship with NIDRR, and an NSF CAREER Award. She is a Fellow of AIMBE and Member of the Society for Neuroscience, BMES, and ASB. She is currently an Associate Editor for the IEEE Transactions on Neural Systems and Rehabilitation Engineering, the Journal of Neuroengineering and Rehabilitation, and Wearable Technologies, and is a Guest Associate Editor for the IEEE Transactions on Robotics.
Corresponding author: Jennie Si, e-mail: si@asu.edu
Ruofan Wu and Zhikai Yao contributed equally to this work.
Received Date: 2021-06-11
Accepted Date: 2021-07-14

Available Online: 2021-08-12

Abstract

We present a state-of-the-art reinforcement learning (RL) control approach that automatically configures robotic prosthesis impedance parameters to enable end-to-end, continuous locomotion intended for transfemoral amputee subjects. Specifically, our actor-critic-based RL provides tracking control of a robotic knee prosthesis to mimic the intact knee profile. This is a significant advance over our previous RL-based automatic tuning of prosthesis control parameters, which centered on regulation control with a designer-prescribed robotic knee profile as the target. In addition to presenting the tracking control algorithm based on direct heuristic dynamic programming (dHDP), we provide a control performance guarantee that includes the case of constrained inputs. We show that the proposed tracking control possesses several important properties: weight convergence of the learning networks, Bellman (sub)optimality of the cost-to-go value function and control input, and practical stability of the human-robot system. We further provide a systematic simulation of the proposed tracking control using a realistic human-robot system simulator, OpenSim, to evaluate how dHDP enables level-ground walking, walking on different terrains, and walking at different paces. These results show that the proposed dHDP-based tracking control is not only theoretically suitable, but also practically useful.
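
To make the term "impedance parameters" concrete, the following is a minimal sketch, not taken from the paper, of a generic impedance control law of the kind such parameters usually enter. It assumes three parameters per gait phase (stiffness, damping, equilibrium angle) and a restoring-torque sign convention, which may differ from the one the authors use. The names `ImpedanceParams`, `knee_torque`, and `tracking_error` are hypothetical, and the RL tuner that would adjust the parameters is not shown.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ImpedanceParams:
    """Hypothetical container for one gait phase's impedance settings."""
    stiffness: float    # K, in N*m/rad (assumed unit)
    damping: float      # B, in N*m*s/rad (assumed unit)
    equilibrium: float  # theta_e, in rad (assumed unit)


def knee_torque(p: ImpedanceParams, angle: float, velocity: float) -> float:
    """Generic impedance law: torque pulls the joint toward theta_e.

    tau = K * (theta_e - theta) - B * omega  (sign convention is an assumption)
    """
    return p.stiffness * (p.equilibrium - angle) - p.damping * velocity


def tracking_error(prosthetic: Sequence[float],
                   intact: Sequence[float]) -> List[float]:
    """Sample-wise difference between prosthetic and intact knee profiles."""
    return [a - b for a, b in zip(prosthetic, intact)]


if __name__ == "__main__":
    params = ImpedanceParams(stiffness=2.0, damping=0.5, equilibrium=0.1)
    print(knee_torque(params, angle=0.3, velocity=-0.2))
    print(tracking_error([0.10, 0.25, 0.40], [0.10, 0.20, 0.45]))
```

In a tracking setting like the one the abstract describes, a learning controller would adjust `stiffness`, `damping`, and `equilibrium` so that the error returned by `tracking_error` shrinks over successive gait cycles; how that adjustment is computed is the subject of the paper and is deliberately left out here.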



Highlights

The first real-time tracking control of a wearable robotic prosthesis with human in the loop
Performance guarantees on learning convergence, solution optimality, and practical stability
Systematic performance evaluation of human-robot system during different walking tasks
