| Citation: | Y. Xiu, Z. Shi, G. Liu, R. Law, D. Li, A. Song, and E. Q. Wu, “Reinforcement learning-based adaptive optimal control for a snake robot,” IEEE/CAA J. Autom. Sinica, early access, 2026. doi: 10.1109/JAS.2025.125762 |
| [1] |
D. Li, B. Zhang, Y. Xiu, H. Deng, M. Zhang, W. Tong, R. Law, G. Zhu, E. Wu, and L. Zhu, “Snake robots play an important role in social services and military needs,” Innovation, vol. 3, no. 6, p. 100333, Nov. 2022.
|
| [2] |
J. Seetohul and M. Shafiee, “Snake robots for surgical applications: A review,” Robotics, vol. 11, no. 3, p. 57, May 2022. doi: 10.3390/robotics11030057
|
| [3] |
D. Li, L. Zeng, Y. Xiu, Z. Pan, D. Zhang, and H. Deng, “Sideslip elimination and coefficient approximation based trajectory tracking control for snake robots,” IEEE Trans. Ind. Inf., vol. 19, no. 8, pp. 8754–8764, Aug. 2023. doi: 10.1109/TII.2022.3220846
|
| [4] |
D. Li, B. Zhang, P. Li, E. Wu, R. Law, X. Xu, A. Song, and L. Zhu, “Parameter estimation and anti-sideslip line-of-sight method-based adaptive path-following controller for a multijoint snake robot,” IEEE Trans. Syst., Man, Cybern.: Syst., vol. 53, no. 8, pp. 4776–4788, Aug. 2023. doi: 10.1109/TSMC.2023.3256383
|
| [5] |
P. Liljebäck, K. Y. Pettersen, Ø. Stavdahl, and J. T. Gravdahl, “A simplified model of planar snake robot locomotion,” in Proc. IEEE-RSJ Int. Conf. Intelligent Robots and Systems, Taipei, China, 2010, pp. 2868–2875.
|
| [6] |
G. Wang, W. Yang, Y. Shen, H. Shao, and C. Wang, “Adaptive path following of underactuated snake robot on unknown and varied frictions ground: Theory and validations,” IEEE Robot. Autom. Lett., vol. 3, no. 4, pp. 4273–4280, Oct. 2018. doi: 10.1109/LRA.2018.2864602
|
| [7] |
E. Kelasidi, K. Y. Pettersen, and J. T. Gravdahl, “Energy efficiency of underwater snake robot locomotion,” in Proc. 23rd Mediterranean Conf. Control and Automation, Torremolinos, Spain, 2015, pp. 1124–1131.
|
| [8] |
B. Xu, M. Jiao, X. Zhang, and D. Zhang, “Path tracking of an underwater snake robot and locomotion efficiency optimization based on improved pigeon-inspired algorithm,” J. Mar. Sci. Eng., vol. 10, no. 1, p. 47, Jan. 2022. doi: 10.3390/jmse10010047
|
| [9] |
D. Zhang, H. Yuan, and Z. Cao, “Environmental adaptive control of a snake-like robot with variable stiffness actuators,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 3, pp. 745–751, May 2020. doi: 10.1109/JAS.2020.1003144
|
| [10] |
P. Liljebäck, I. U. Haugstuen, and K. Y. Pettersen, “Path following control of planar snake robots using a cascaded approach,” IEEE Trans. Control Syst. Technol., vol. 20, no. 1, pp. 111–126, Jan. 2012.
|
| [11] |
W. Yang, G. Wang, H. Shao, and Y. Shen, “Spline based curve path following of underactuated snake robots,” in Proc. Int. Conf. Robotics and Automation, Montreal, Canada, 2019, pp. 5352–5358.
|
| [12] |
D. Li, Z. Pan, H. Deng, and L. Hu, “Adaptive path following controller of a multijoint snake robot based on the improved serpenoid curve,” IEEE Trans. Ind. Electron., vol. 69, no. 4, pp. 3831–3842, Apr. 2022. doi: 10.1109/TIE.2021.3075851
|
| [13] |
D. Li, Y. Zhang, W. Tong, P. Li, R. Law, X. Xu, L. Zhu, and E. Wu, “Anti-disturbance path-following control for snake robots with spiral motion,” IEEE Trans. Ind. Inf., vol. 19, no. 12, pp. 11929–11940, Dec. 2023. doi: 10.1109/TII.2023.3254534
|
| [14] |
Y. Xiu, D. Li, M. Zhang, H. Deng, R. Law, Y. Huang, E. Q. Wu, and X. Xu, “Finite-time sideslip differentiator-based LOS guidance for robust path following of snake robots,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 1, pp. 239–253, Jan. 2023. doi: 10.1109/JAS.2022.106052
|
| [15] |
H. Fukushima, T. Yanagiya, Y. Ota, M. Katsumoto, and F. Matsuno, “Model predictive path-following control of snake robots using an averaged model,” IEEE Trans. Control Syst. Technol., vol. 29, no. 6, pp. 2444–2456, Nov. 2021. doi: 10.1109/TCST.2020.3043446
|
| [16] |
H. Li, Q. Zhang, and D. Zhao, “Deep reinforcement learning-based automatic exploration for navigation in unknown environment,” IEEE Trans. Neural Networks Learn. Syst., vol. 31, no. 6, pp. 2064–2076, Jun. 2020. doi: 10.1109/TNNLS.2019.2927869
|
| [17] |
H. Shen, Y. Wang, J. Wang, and J. H. Park, “A fuzzy-model-based approach to optimal control for nonlinear Markov jump singularly perturbed systems: A novel integral reinforcement learning scheme,” IEEE Trans. Fuzzy Syst., vol. 31, no. 10, pp. 3734–3740, Oct. 2023. doi: 10.1109/TFUZZ.2023.3265666
|
| [18] |
D. Li, B. Zhang, R. Law, E. Q. Wu, and X. Xu, “Error constrained-formation path-following method with disturbance elimination for multisnake robots,” IEEE Trans. Ind. Electron., vol. 71, no. 5, pp. 4987–4998, May 2024. doi: 10.1109/TIE.2023.3288202
|
| [19] |
J. Mukherjee, S. Roy, I. N. Kar, and S. Mukherjee, “Maneuvering control of planar snake robot: An adaptive robust approach with artificial time delay,” Int. J. Robust Nonlinear Control, vol. 31, no. 9, pp. 3982–3999, Mar. 2021. doi: 10.1002/rnc.5430
|
| [20] |
D. Li, Y. Zhang, P. Li, R. Law, Z. Xiang, X. Xu, L. Zhu, and E. Wu, “Position errors and interference prediction-based trajectory tracking for snake robots,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 9, pp. 1810–1821, Sep. 2023. doi: 10.1109/JAS.2023.123612
|
| [21] |
D. Li, J. Zhou, Y. Huang, D. Zhang, P. Li, and A. Song, “Integral line of sight guidance scheme-based tracking method for snake robots,” IEEE Trans. Autom. Sci. Eng., vol. 22, pp. 4537–4547, 2025. doi: 10.1109/TASE.2023.3327958
|
| [22] |
L. Chen, C. Dong, and S.-L. Dai, “Reinforcement learning-based finite-time optimal containment control for underactuated surface vehicles with guaranteed performance,” IEEE Trans. Syst., Man, Cybern.: Syst., vol. 54, no. 12, pp. 7206–7217, Dec. 2024. doi: 10.1109/TSMC.2024.3449343
|
| [23] |
S. Zuo, Y. Song, F. L. Lewis, and A. Davoudi, “Optimal robust output containment of unknown heterogeneous multiagent system using off-policy reinforcement learning,” IEEE Trans. Cybern., vol. 48, no. 11, pp. 3197–3207, Nov. 2018. doi: 10.1109/TCYB.2017.2761878
|
| [24] |
Y. Xiu, D. Li, H. Deng, S. Jiang, and E. Q. Wu, “Path-following based on fuzzy line-of-sight guidance for a bionic snake robot with unknowns,” IEEE/ASME Trans. Mechatron., vol. 28, no. 6, pp. 3167–3179, Dec. 2023. doi: 10.1109/TMECH.2023.3254817
|
| [25] |
Y. Xiu, Y. Zhang, H. Deng, H. Li, and Y. Xu, “Collaborative line-of-sight guidance-based robust formation control for a multi-snake robot,” IEEE Trans. Autom. Sci. Eng., vol. 22, pp. 4514–4524, 2025. doi: 10.1109/TASE.2023.3348469
|
| [26] |
C. Wang, Y. Shi, Y. Wang, S. Xu, and M. Liang, “Event-triggered adaptive fuzzy output feedback tracking control for pneumatic servo system with input voltage saturation and position constraint,” IEEE Trans. Ind. Inf., vol. 20, no. 3, pp. 4360–4369, Mar. 2024. doi: 10.1109/TII.2023.3316222
|
| [27] |
X.-Q. Cai, P. Zhang, L. Zhao, J. Bian, M. Sugiyama, and A. J. Llorens, “Distributional Pareto-optimal multi-objective reinforcement learning,” in Proc. 37th Int. Conf. Neural Information Processing Systems, New Orleans, USA, 2024, p. 686.
|
| [28] |
J. Wu, J. Zhang, B. Nie, Y. Liu, and X. He, “Adaptive control of PMSM servo system for steering-by-wire system with disturbances observation,” IEEE Trans. Transp. Electrif., vol. 8, no. 2, pp. 2015–2028, Jun. 2022. doi: 10.1109/TTE.2021.3128429
|
| [29] |
H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, USA: Prentice Hall, 2002.
|
| [30] |
J. Zhao, Y. Lv, Z. Zhao, and Z. Wang, “Adaptive optimal tracking control of servo mechanisms via generalized policy learning,” IEEE Trans. Instrum. Meas., vol. 73, p. 3002311, Sep. 2024.
|