IEEE/CAA Journal of Automatica Sinica
Citation: M. Wang, H. T. Shi, and C. Wang, “Distributed cooperative learning for discrete-time strict-feedback multi-agent systems over directed graphs,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 10, pp. 1831–1844, Oct. 2022. doi: 10.1109/JAS.2022.105542
This paper focuses on the distributed cooperative learning (DCL) problem for a class of discrete-time strict-feedback multi-agent systems under directed graphs. Compared with previous DCL works based on undirected graphs, two main challenges arise: the Laplacian matrix of a directed graph is nonsymmetric, and the derived weight error systems contain n-step delays. Two novel lemmas are developed in this paper to show the exponential convergence of two kinds of linear time-varying (LTV) systems that feature the nonsymmetric Laplacian matrix and time delays. Subsequently, an adaptive neural network (NN) control scheme is proposed by establishing a directed communication graph along with a weight updating law with n-step delays. Then, by using the two novel lemmas on the extended exponential convergence of LTV systems, the estimated NN weights of all agents are verified to exponentially converge to small neighbourhoods of their common optimal values if the directed communication graphs are strongly connected and balanced. The stored NN weights are reused to construct learning controllers that improve control performance on similar control tasks by means of the “mod” function and proper time series. A simulation comparison is shown to demonstrate the validity of the proposed DCL method.
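The graph condition in the abstract (strongly connected and balanced, with a nonsymmetric Laplacian) can be illustrated with a small sketch. This is not the paper's weight updating law: it is a plain discrete-time consensus iteration, x(k+1) = x(k) - eps * L x(k), on a hypothetical four-agent directed cycle. The function names, the adjacency convention (`ADJ[i][j] = 1` means agent i receives from agent j), the step size `EPS`, and the initial values are all assumptions made for illustration.

```python
from collections import deque


def laplacian(adj):
    """Return L = D - A, where D holds each node's row sum (in-degree)."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def is_balanced(adj):
    """A digraph is balanced when every node's in-degree equals its out-degree."""
    n = len(adj)
    return all(abs(sum(adj[i]) - sum(adj[j][i] for j in range(n))) < 1e-12
               for i in range(n))


def is_strongly_connected(adj):
    """Check that node 0 reaches every node along edges and along reversed edges."""
    n = len(adj)

    def reaches_all(weight):
        seen = {0}
        queue = deque([0])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if weight(u, v) > 0 and v not in seen:
                    seen.add(v)
                    queue.append(v)
        return len(seen) == n

    return (reaches_all(lambda u, v: adj[u][v])
            and reaches_all(lambda u, v: adj[v][u]))


def consensus_step(x, lap, eps):
    """One step of x(k+1) = x(k) - eps * L x(k)."""
    n = len(x)
    return [x[i] - eps * sum(lap[i][j] * x[j] for j in range(n))
            for i in range(n)]


# Hypothetical directed cycle: agent i listens only to agent (i + 1) mod 4.
N = 4
ADJ = [[1 if j == (i + 1) % N else 0 for j in range(N)] for i in range(N)]
LAP = laplacian(ADJ)
EPS = 0.5  # assumed step size, chosen so that I - EPS * L is a stochastic matrix

x0 = [1.0, 2.0, 3.0, 6.0]  # assumed initial values; their average is 3.0
x_final = list(x0)
for _ in range(100):
    x_final = consensus_step(x_final, LAP, EPS)
```

In this sketch the Laplacian is nonsymmetric because every edge is one-way, yet its columns still sum to zero because the graph is balanced. The sum of the agents' values is therefore preserved at every step, and strong connectivity drives all agents to the common average of the initial values.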