A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 9 Issue 1
Jan.  2022

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
    CiteScore: 23.5, Top 2% (Q1)
    Google Scholar h5-index: 77, TOP 5
Citation: R. F. Wu, Z. K. Yao, J. Si, and H. Huang, “Robotic knee tracking control to mimic the intact human knee profile based on actor-critic reinforcement learning,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 1, pp. 19–30, Jan. 2022. doi: 10.1109/JAS.2021.1004272

Robotic Knee Tracking Control to Mimic the Intact Human Knee Profile Based on Actor-Critic Reinforcement Learning

doi: 10.1109/JAS.2021.1004272
Funds:  This work was partly supported by the National Science Foundation (1563921, 1808752, 1563454, 1808898)
  • We present a state-of-the-art reinforcement learning (RL) control approach that automatically configures robotic prosthesis impedance parameters to enable end-to-end, continuous locomotion for transfemoral amputee subjects. Specifically, our actor-critic RL provides tracking control of a robotic knee prosthesis so that it mimics the intact knee profile. This is a significant advance over our previous RL-based automatic tuning of prosthesis control parameters, which centered on regulation control with a designer-prescribed robotic knee profile as the target. In addition to presenting the tracking control algorithm, which is based on direct heuristic dynamic programming (dHDP), we provide a control performance guarantee that covers the case of constrained inputs. We show that the proposed tracking control has several important properties: weight convergence of the learning networks, Bellman (sub)optimality of the cost-to-go value function and control input, and practical stability of the human-robot system. We further provide a systematic simulation study of the proposed tracking control using a realistic human-robot system simulator, OpenSim, to emulate how dHDP enables level-ground walking, walking on different terrains, and walking at different paces. These results show that the proposed dHDP-based tracking control is not only theoretically suitable but also practically useful.
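
To make the quantities named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a common impedance control form, torque = K(θ − θ_e) + Bω, with stiffness K, damping B, and equilibrium angle θ_e as the tunable impedance parameters, and it scores tracking by the summed squared error between the prosthetic and intact knee profiles. The names `ImpedanceParams`, `knee_torque`, `tracking_cost`, and `clip`, the sign convention, the units, and the quadratic cost are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ImpedanceParams:
    """Hypothetical container for the three impedance parameters."""
    stiffness: float    # K, assumed units N*m/rad
    damping: float      # B, assumed units N*m*s/rad
    equilibrium: float  # theta_e, assumed units rad


def knee_torque(p: ImpedanceParams, angle: float, velocity: float) -> float:
    """Impedance law tau = K*(theta - theta_e) + B*omega (assumed sign convention)."""
    return p.stiffness * (angle - p.equilibrium) + p.damping * velocity


def tracking_cost(prosthetic_profile, intact_profile) -> float:
    """Sum of squared errors between two knee-angle profiles of equal length."""
    if len(prosthetic_profile) != len(intact_profile):
        raise ValueError("profiles must have the same number of samples")
    return sum((a - b) ** 2 for a, b in zip(prosthetic_profile, intact_profile))


def clip(value: float, lower: float, upper: float) -> float:
    """Saturate a control adjustment, a simple stand-in for constrained inputs."""
    return max(lower, min(upper, value))


if __name__ == "__main__":
    params = ImpedanceParams(stiffness=2.0, damping=0.5, equilibrium=0.1)
    print(knee_torque(params, angle=0.3, velocity=1.0))
    print(tracking_cost([1.0, 2.0], [1.0, 4.0]))
    print(clip(5.0, -1.0, 1.0))
```

In a tracking setup of the kind the abstract describes, a learning controller would adjust the impedance parameters so that a cost like `tracking_cost` decreases from step to step; the actor-critic update itself is not shown here.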


  • Ruofan Wu and Zhikai Yao contributed equally to this work.



    Figures(5)  / Tables(3)



    • The first real-time tracking control of a wearable robotic prosthesis with human in the loop
    • Performance guarantees on learning convergence, solution optimality, and practical stability
    • Systematic performance evaluation of human-robot system during different walking tasks

