IEEE/CAA Journal of Automatica Sinica
Citation: M. M. Ha, D. Wang, and D. Liu, “Discounted iterative adaptive critic designs with novel stability analysis for tracking control,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 7, pp. 1262–1272, Jul. 2022. doi: 10.1109/JAS.2022.105692
[1] D. Liu, Q. L. Wei, D. Wang, X. Yang, and H. L. Li, Adaptive Dynamic Programming with Applications in Optimal Control. Cham, Switzerland: Springer, 2017.
[2] D. Liu and Q. L. Wei, “Finite-approximation-error-based optimal control approach for discrete-time nonlinear systems,” IEEE Trans. Cybern., vol. 43, no. 2, pp. 779–789, Apr. 2013. doi: 10.1109/TSMCB.2012.2216523
[3] D. Wang, J. F. Qiao, and L. Cheng, “An approximate neuro-optimal solution of discounted guaranteed cost control design,” IEEE Trans. Cybern., vol. 52, no. 1, pp. 77–86, Jan. 2022. doi: 10.1109/TCYB.2020.2977318
[4] Q. L. Wei and D. Liu, “A novel iterative θ-adaptive dynamic programming for discrete-time nonlinear systems,” IEEE Trans. Autom. Sci. Eng., vol. 11, no. 4, pp. 1176–1190, Oct. 2014. doi: 10.1109/TASE.2013.2280974
[5] M. M. Ha, D. Wang, and D. Liu, “Event-triggered adaptive critic control design for discrete-time constrained nonlinear systems,” IEEE Trans. Syst., Man, Cybern.: Syst., vol. 50, no. 9, pp. 3158–3168, Sept. 2020. doi: 10.1109/TSMC.2018.2868510
[6] D. Wang, M. M. Ha, and M. M. Zhao, “The intelligent critic framework for advanced optimal control,” Artif. Intell. Rev., vol. 55, no. 1, pp. 1–22, Jan. 2022. doi: 10.1007/s10462-021-10118-9
[7] M. M. Ha, D. Wang, and D. Liu, “Event-triggered constrained control with DHP implementation for nonaffine discrete-time systems,” Inf. Sci., vol. 519, pp. 110–123, May 2020. doi: 10.1016/j.ins.2020.01.020
[8] D. Wang, H. B. He, and D. Liu, “Adaptive critic nonlinear robust control: A survey,” IEEE Trans. Cybern., vol. 47, no. 10, pp. 3429–3451, Oct. 2017. doi: 10.1109/TCYB.2017.2712188
[9] Q. L. Wei, D. Liu, Y. Liu, and R. Z. Song, “Optimal constrained self-learning battery sequential management in microgrid via adaptive dynamic programming,” IEEE/CAA J. Autom. Sinica, vol. 4, no. 2, pp. 168–176, Apr. 2017. doi: 10.1109/JAS.2016.7510262
[10] D. Liu, Y. C. Xu, Q. L. Wei, and X. L. Liu, “Residential energy scheduling for variable weather solar energy based on adaptive dynamic programming,” IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 36–46, Jan. 2018. doi: 10.1109/JAS.2017.7510739
[11] A. Heydari and S. N. Balakrishnan, “Adaptive critic-based solution to an orbital rendezvous problem,” J. Guid., Control, Dyn., vol. 37, no. 1, pp. 344–350, Jan. 2014. doi: 10.2514/1.60553
[12] A. Heydari, “Theoretical and numerical analysis of approximate dynamic programming with approximation errors,” J. Guid., Control, Dyn., vol. 39, no. 2, pp. 301–311, Feb. 2016. doi: 10.2514/1.G001154
[13] D. Wang, M. M. Ha, and J. F. Qiao, “Data-driven iterative adaptive critic control toward an urban wastewater treatment plant,” IEEE Trans. Ind. Electron., vol. 68, no. 8, pp. 7362–7369, Aug. 2021. doi: 10.1109/TIE.2020.3001840
[14] X. Han, Z. Z. Zheng, L. Liu, B. Wang, Z. T. Cheng, H. J. Fan, and Y. J. Wang, “Online policy iteration ADP-based attitude-tracking control for hypersonic vehicles,” Aerosp. Sci. Technol., vol. 106, p. 106233, Nov. 2020.
[15] F. L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32–50, Jan. 2009. doi: 10.1109/MCAS.2009.933854
[16] F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, “Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers,” IEEE Control Syst. Mag., vol. 32, no. 6, pp. 76–105, Dec. 2012. doi: 10.1109/MCS.2012.2214134
[17] H. Li and D. Liu, “Optimal control for discrete-time affine non-linear systems using general value iteration,” IET Control Theory Appl., vol. 6, no. 18, pp. 2725–2736, Dec. 2012. doi: 10.1049/iet-cta.2011.0783
[18] Q. L. Wei, D. Liu, and H. Q. Lin, “Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems,” IEEE Trans. Cybern., vol. 46, no. 3, pp. 840–853, Mar. 2016. doi: 10.1109/TCYB.2015.2492242
[19] D. Wang and X. N. Zhong, “Advanced policy learning near-optimal regulation,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 3, pp. 743–749, May 2019. doi: 10.1109/JAS.2019.1911489
[20] A. Heydari, “Stability analysis of optimal adaptive control using value iteration with approximation errors,” IEEE Trans. Autom. Control, vol. 63, no. 9, pp. 3119–3126, Sept. 2018. doi: 10.1109/TAC.2018.2790260
[21] D. Liu and Q. L. Wei, “Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems,” IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 621–634, Mar. 2014. doi: 10.1109/TNNLS.2013.2281663
[22] D. P. Bertsekas, “Value and policy iterations in optimal control and adaptive dynamic programming,” IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 3, pp. 500–509, Mar. 2017. doi: 10.1109/TNNLS.2015.2503980
[23] D. Liu, D. Wang, and H. L. Li, “Decentralized stabilization for a class of continuous-time nonlinear interconnected systems using online learning optimal control approach,” IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 2, pp. 418–428, Feb. 2014. doi: 10.1109/TNNLS.2013.2280013
[24] A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof,” IEEE Trans. Syst., Man, Cybern., Part B (Cybern.), vol. 38, no. 4, pp. 943–949, Aug. 2008. doi: 10.1109/TSMCB.2008.926614
[25] D. Liu, X. Yang, D. Wang, and Q. L. Wei, “Reinforcement-learning-based robust controller design for continuous-time uncertain nonlinear systems subject to input constraints,” IEEE Trans. Cybern., vol. 45, no. 7, pp. 1372–1385, Jul. 2015. doi: 10.1109/TCYB.2015.2417170
[26] D. Liu, S. Xue, B. Zhao, B. Luo, and Q. L. Wei, “Adaptive dynamic programming for control: A survey and recent advances,” IEEE Trans. Syst., Man, Cybern.: Syst., vol. 51, no. 1, pp. 142–160, Jan. 2021. doi: 10.1109/TSMC.2020.3042876
[27] B. Lincoln and A. Rantzer, “Relaxing dynamic programming,” IEEE Trans. Autom. Control, vol. 51, no. 8, pp. 1249–1260, Aug. 2006. doi: 10.1109/TAC.2006.878720
[28] A. Heydari, “Stability analysis of optimal adaptive control under value iteration using a stabilizing initial policy,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 9, pp. 4522–4527, Sept. 2018. doi: 10.1109/TNNLS.2017.2755501
[29] D. Wang, D. Liu, Q. L. Wei, D. B. Zhao, and N. Jin, “Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming,” Automatica, vol. 48, no. 8, pp. 1825–1832, Aug. 2012.
[30] M. M. Ha, D. Wang, and D. Liu, “Generalized value iteration for discounted optimal control with stability analysis,” Syst. Control Lett., vol. 147, p. 104847, Jan. 2021.
[31] D. Wang, M. M. Ha, and J. F. Qiao, “Self-learning optimal regulation for discrete-time nonlinear systems under event-driven formulation,” IEEE Trans. Autom. Control, vol. 65, no. 3, pp. 1272–1279, Mar. 2020. doi: 10.1109/TAC.2019.2926167
[32] H. G. Zhang, Q. L. Wei, and Y. H. Luo, “A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm,” IEEE Trans. Syst., Man, Cybern., Part B (Cybern.), vol. 38, no. 4, pp. 937–942, Aug. 2008. doi: 10.1109/TSMCB.2008.920269
[33] B. Kiumarsi, F. L. Lewis, H. Modares, A. Karimpour, and M. B. Naghibi-Sistani, “Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics,” Automatica, vol. 50, no. 4, pp. 1167–1175, Apr. 2014. doi: 10.1016/j.automatica.2014.02.015
[34] B. Kiumarsi and F. L. Lewis, “Actor-critic-based optimal tracking for partially unknown nonlinear discrete-time systems,” IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 1, pp. 140–151, Jan. 2015. doi: 10.1109/TNNLS.2014.2358227
[35] Y. Cao, Y. D. Song, and C. Y. Wen, “Practical tracking control of perturbed uncertain nonaffine systems with full state constraints,” Automatica, vol. 110, p. 108608, Dec. 2019.
[36] M. M. Ha, D. Wang, and D. Liu, “Data-based nonaffine optimal tracking control using iterative DHP approach,” IFAC-PapersOnLine, vol. 53, no. 2, pp. 4246–4251, Jul. 2020. doi: 10.1016/j.ifacol.2020.12.2473
[37] M. M. Ha, D. Wang, and D. Liu, “Value-iteration-based neuro-optimal tracking control for affine systems with completely unknown dynamics,” in Proc. 39th Chinese Control Conf., Shenyang, China, 2020, pp. 1951–1956.
[38] D. Wang, D. Liu, and Q. L. Wei, “Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach,” Neurocomputing, vol. 78, no. 1, pp. 14–22, Feb. 2012. doi: 10.1016/j.neucom.2011.03.058
[39] B. Kiumarsi, F. L. Lewis, M. B. Naghibi-Sistani, and A. Karimpour, “Optimal tracking control of unknown discrete-time linear systems using input-output measured data,” IEEE Trans. Cybern., vol. 45, no. 12, pp. 2770–2779, Dec. 2015. doi: 10.1109/TCYB.2014.2384016
[40] L. Liu, Z. S. Wang, and H. G. Zhang, “Neural-network-based robust optimal tracking control for MIMO discrete-time systems with unknown uncertainty using adaptive critic design,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 4, pp. 1239–1251, Apr. 2018. doi: 10.1109/TNNLS.2017.2660070
[41] B. Luo, D. Liu, T. W. Huang, and D. Wang, “Model-free optimal tracking control via critic-only Q-learning,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 10, pp. 2134–2144, Oct. 2016. doi: 10.1109/TNNLS.2016.2585520
[42] C. Li, J. L. Ding, F. L. Lewis, and T. Y. Chai, “A novel adaptive dynamic programming based on tracking error for nonlinear discrete-time systems,” Automatica, vol. 129, p. 109687, Jul. 2021.
[43] H. Modares and F. L. Lewis, “Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning,” Automatica, vol. 50, no. 7, pp. 1780–1792, Jul. 2014. doi: 10.1016/j.automatica.2014.05.011
[44] C. B. Qin, H. G. Zhang, and Y. H. Luo, “Online optimal tracking control of continuous-time linear systems with unknown dynamics by using adaptive dynamic programming,” Int. J. Control, vol. 87, no. 5, pp. 1000–1009, May 2014. doi: 10.1080/00207179.2013.863432
[45] R. Kamalapurkar, H. Dinh, S. Bhasin, and W. E. Dixon, “Approximate optimal trajectory tracking for continuous-time nonlinear systems,” Automatica, vol. 51, pp. 40–48, Jan. 2015. doi: 10.1016/j.automatica.2014.10.103
[46] C. Chen, H. Modares, K. Xie, F. L. Lewis, Y. Wan, and S. L. Xie, “Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown dynamics,” IEEE Trans. Autom. Control, vol. 64, no. 11, pp. 4423–4438, Nov. 2019. doi: 10.1109/TAC.2019.2905215
[47] X. N. Zhong, Z. Ni, and H. B. He, “A theoretical foundation of goal representation heuristic dynamic programming,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 12, pp. 2513–2525, Dec. 2016. doi: 10.1109/TNNLS.2015.2490698