IEEE/CAA Journal of Automatica Sinica
Citation: J. Zhao, C. Yang, W. Gao, L. Zhou, and X. Liu, “Adaptive optimal output regulation of interconnected singularly perturbed systems with application to power systems,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 3, pp. 595–607, Mar. 2024. doi: 10.1109/JAS.2023.123651
This article studies the adaptive optimal output regulation problem for a class of interconnected singularly perturbed systems (SPSs) with unknown dynamics, based on reinforcement learning (RL). Exploiting the slow and fast characteristics of the system states, the interconnected SPS is decomposed into slow time-scale dynamics and fast time-scale dynamics through singular perturbation theory. For the fast time-scale dynamics with interconnections, we devise a decentralized optimal control strategy by selecting appropriate weight matrices in the cost function. For the slow time-scale dynamics with unknown system parameters, an off-policy RL algorithm with a convergence guarantee is given to learn the optimal control strategy from measurement data. By combining the slow and fast controllers, we establish the composite decentralized adaptive optimal output regulator, and rigorously analyze the stability and optimality of the closed-loop system. The proposed decomposition-based design not only bypasses numerical stiffness but also alleviates the high dimensionality of the original problem. The efficacy of the proposed methodology is validated on a load-frequency control application in a two-area power system.
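To fix ideas, the model-based counterpart of the off-policy RL step described in the abstract is the classical Kleinman policy iteration for the continuous-time LQR: alternate policy evaluation (a Lyapunov equation) and policy improvement until the Riccati solution is reached. The sketch below is purely illustrative and is not the paper's data-driven algorithm; the matrices `A`, `B`, `Q`, `R` are hypothetical, and in the paper's setting `A` and `B` for the slow dynamics are unknown and the same iteration is emulated from measured trajectory data instead.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def kleinman_lqr(A, B, Q, R, K0, iters=30):
    """Model-based Kleinman policy iteration for the continuous-time LQR.

    K0 must stabilize A - B @ K0; the iterates P_k then converge to the
    stabilizing solution of the algebraic Riccati equation.
    """
    K = K0
    Rinv = np.linalg.inv(R)
    for _ in range(iters):
        Ak = A - B @ K
        # Policy evaluation: solve Ak' P + P Ak = -(Q + K' R K)
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B' P
        K = Rinv @ B.T @ P
    return P, K

# Hypothetical second-order example (A is already Hurwitz, so K0 = 0 is admissible)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K = kleinman_lqr(A, B, Q, R, K0=np.zeros((1, 2)))

# The iterates match the direct ARE solution
print(np.allclose(P, solve_continuous_are(A, B, Q, R)))  # → True
```

The off-policy RL scheme in the paper replaces the Lyapunov-equation step, which requires knowledge of `A` and `B`, with a least-squares problem built from state and input measurements, which is what makes the design applicable when the slow dynamics are unknown.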