A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical/experimental research and development in all areas of automation.

IEEE/CAA Journal of Automatica Sinica

Citation: C. Xu, J. Zhang, and H. Yu, “Control-communication co-optimization for wireless cloud robotic system via multi-agent transfer reinforcement learning,” IEEE/CAA J. Autom. Sinica, vol. 13, no. 2, pp. 1–16, Feb. 2026. doi: 10.1109/JAS.2025.125894

Control-Communication Co-Optimization for Wireless Cloud Robotic System via Multi-Agent Transfer Reinforcement Learning

doi: 10.1109/JAS.2025.125894
Funds: This work was supported in part by the National Natural Science Foundation of China (62522320, 92267108, 62173322), the Liaoning Revitalization Talents Program (XLYC2403062), and the Science and Technology Program of Liaoning Province (2023JH3/10200004, 2022JH25/10100005).
The wireless cloud robotic system (WCRS), which integrates sensing, communication, computing, and control capabilities into a single intelligent agent, is a promising route to intelligent manufacturing because it is easy to deploy and flexible to expand. However, high-precision control of a WCRS requires deterministic wireless communication, which remains challenging in a complex and dynamic radio environment. This paper employs a reconfigurable intelligent surface (RIS) to establish a novel RIS-assisted WCRS architecture, in which the radio channel is controlled to achieve ultra-reliable, low-delay, low-jitter communication for high-precision closed-loop motion control. Because control and communication are strongly coupled, they should be co-optimized. Taking into account the constraints on control input threshold, control delay deadline, beam phase, antenna power, and information distortion, we formulate a stability maximization problem that jointly optimizes control input compensation, RIS phase shift, and beamforming. A new jitter-oriented system stability objective with respect to control error and communication jitter is defined, and a closed-form expression for the control delay deadline is derived based on Jensen's inequality and a Lyapunov-Krasovskii functional. Because the channel and robot states are time-varying and only partially observable, we model the problem as a partially observable Markov decision process (POMDP). To solve this complex problem, we propose a multi-agent transfer reinforcement learning algorithm named LSTM-PPO-MATRL, in which LSTM-enhanced proximal policy optimization (PPO) is designed to approximate an optimal solution and option-guided policy transfer learning is proposed to facilitate the learning process. Using centralized training and decentralized execution, LSTM-PPO-MATRL is validated by extensive experiments on MuJoCo tasks in both low-mobility and high-mobility robotic control scenarios. The results demonstrate that LSTM-PPO-MATRL not only achieves high learning efficiency but also supports low-delay, low-jitter communication for low-error control, with a 71.9% improvement in control accuracy and a 68.7% reduction in delay jitter compared with the PPO-MADRL baseline.
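The abstract names proximal policy optimization (PPO) as the learner underlying LSTM-PPO-MATRL. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate loss in plain Python. It is not the authors' algorithm: the LSTM encoder, the multi-agent structure, and the option-guided transfer step are all omitted, and the function name `ppo_clip_loss`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for this example.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate, averaged over samples.

    Hypothetical helper for illustration; inputs are equal-length lists of
    per-sample log-probabilities (new and old policy) and advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    # Return the negative mean so the value can be minimized.
    return -total / len(advantages)


if __name__ == "__main__":
    # Ratio of 2 with a positive advantage is clipped at 1 + clip_eps.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

The clipping removes the incentive to move the policy far from the one that collected the data, which is one reason PPO is a common base for the kind of multi-agent training the abstract describes.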

     

