A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation.

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 19.2, Top 1 (SCI Q1)
    CiteScore: 28.2, Top 1% (Q1)
    Google Scholar h5-index: 95, TOP 5
Citation: Y. Xiu, Z. Shi, G. Liu, R. Law, D. Li, A. Song, and E. Q. Wu, “Reinforcement learning-based adaptive optimal control for a snake robot,” IEEE/CAA J. Autom. Sinica, early access, 2026. doi: 10.1109/JAS.2025.125762

Reinforcement Learning-Based Adaptive Optimal Control for a Snake Robot

doi: 10.1109/JAS.2025.125762
Funds: This work was supported in part by the National Natural Science Foundation of China (62303117, T2325018, and 62171274), the Military Science and Technology Commission Science and Technology Innovation Project (C1692), and the Fujian Provincial Natural Science Foundation (2024J01278).
  • Snake robots are difficult to model accurately, which renders model-based control schemes ineffective, and constraints on motion velocity and energy consumption make the meandering gait challenging to optimize. This work proposes a two-layer reinforcement learning-based adaptive optimal control framework for snake robots that achieves trajectory tracking with an energy-efficient optimal gait. In the optimization layer, a multi-objective problem over gait amplitude, frequency, and phase is formulated; it balances minimizing energy consumption against maximizing velocity through a weighted sum. Proximal policy optimization yields multiple pairings of gait parameters and resulting performance, from which users can select the preferred combination. In the control layer, a reinforcement learning optimal controller based on an actor-critic-identifier neural network is designed to address the unknown dynamics and the difficulty of solving the Bellman equation. It adaptively approximates the cost function and control policy, reducing the dependence on an accurate model and avoiding excessive computational complexity. Theoretical analysis shows that the proposed method guarantees the stability of the tracking errors for snake robots with optimal cost. Comparative simulation results demonstrate the effectiveness and superiority of the method.
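The weighted-sum trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The serpenoid joint reference is the gait form commonly used in the snake robot literature and is assumed here. The function names, the single weight `w_velocity`, and the candidate (velocity, energy) numbers are all hypothetical and chosen only to show how different weights favor different gaits.

```python
import math


def serpenoid_joint_angles(amplitude, frequency, phase_shift, t, n_joints, offset=0.0):
    """Joint reference angles at time t for a serpenoid gait (assumed gait form).

    Joint i follows amplitude * sin(frequency * t + i * phase_shift) + offset.
    """
    return [
        amplitude * math.sin(frequency * t + i * phase_shift) + offset
        for i in range(n_joints)
    ]


def weighted_sum_reward(velocity, energy, w_velocity):
    """Scalarize the two objectives: reward velocity, penalize energy.

    w_velocity in [0, 1] is a hypothetical user-chosen weight; the energy
    term receives the complementary weight (1 - w_velocity).
    """
    if not 0.0 <= w_velocity <= 1.0:
        raise ValueError("w_velocity must lie in [0, 1]")
    return w_velocity * velocity - (1.0 - w_velocity) * energy


def best_gait(candidates, w_velocity):
    """Pick the candidate gait with the highest scalarized reward."""
    return max(
        candidates,
        key=lambda c: weighted_sum_reward(c["velocity"], c["energy"], w_velocity),
    )


# Made-up performance values for three gaits (illustration only, not measured data).
candidates = [
    {"name": "slow", "velocity": 0.10, "energy": 0.20},
    {"name": "medium", "velocity": 0.25, "energy": 0.60},
    {"name": "fast", "velocity": 0.40, "energy": 1.50},
]

if __name__ == "__main__":
    for w in (0.2, 0.75, 0.9):
        print(w, best_gait(candidates, w)["name"])
```

Sweeping the weight and recording the selected gait for each value gives a set of parameter and performance pairings of the kind the abstract mentions. In the paper the pairings come from proximal policy optimization rather than from a fixed candidate table like the one above.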

     



    Figures(21)  / Tables(2)
