Citation: B. Yang, C. Tang, Y. Liu, G. Wen, and G. Chen, “A linear programming-based reinforcement learning mechanism for incomplete-information games,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 11, pp. 2340–2342, Nov. 2024. doi: 10.1109/JAS.2024.124464
[1] H. Kebriaei, A. Rahimi-Kian, and M. N. Ahmadabadi, “Model-based and learning-based decision making in incomplete information Cournot games: A state estimation approach,” IEEE Trans. Syst. Man Cybern. Syst., vol. 45, no. 4, pp. 713–718, Apr. 2015. doi: 10.1109/TSMC.2014.2373336
[2] L. Xue, C. Sun, D. Wunsch, Y. Zhou, and F. Yu, “An adaptive strategy via reinforcement learning for the prisoner’s dilemma game,” IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 301–310, Jan. 2018. doi: 10.1109/JAS.2017.7510466
[3] W. Zha, J. Chen, and Z. Peng, “Dynamic multi-team antagonistic games model with incomplete information and its application to multi-UAV,” IEEE/CAA J. Autom. Sinica, vol. 2, no. 1, pp. 74–84, Jan. 2015. doi: 10.1109/JAS.2015.7032908
[4] H. Wang, T. Huang, X. Liao, H. Abu-Rub, and G. Chen, “Reinforcement learning for constrained energy trading games with incomplete information,” IEEE Trans. Cybern., vol. 47, no. 10, pp. 3404–3416, Oct. 2017. doi: 10.1109/TCYB.2016.2539300
[5] D. Shen, “Iterative learning control with incomplete information: A survey,” IEEE/CAA J. Autom. Sinica, vol. 5, no. 5, pp. 885–901, Sep. 2018. doi: 10.1109/JAS.2018.7511123
[6] G. Wen, J. Fu, P. Dai, and J. Zhou, “DTDE: A new cooperative multiagent reinforcement learning framework,” The Innovation, vol. 2, no. 4, p. 100162, Sep. 2021. doi: 10.1016/j.xinn.2021.100162
[7] J. Tsitsiklis, “Asynchronous stochastic approximation and Q-learning,” Mach. Learn., vol. 16, no. 3, pp. 185–202, Sep. 1994.
[8] Y. Zhou, J. Li, and J. Zhu, “Posterior sampling for multi-agent reinforcement learning: Solving extensive games with imperfect information,” in Proc. Int. Conf. Learn. Represent., 2020.
[9] L. Meng, Z. Ge, P. Tian, B. An, and Y. Gao, “An efficient deep reinforcement learning algorithm for solving imperfect information extensive-form games,” in Proc. AAAI Conf. Artif. Intell., Jun. 2023, vol. 37, no. 5, pp. 5823–5831.
[10] E. Lockhart, M. Lanctot, J. Pérolat, J. Lespiau, D. Morrill, F. Timbers, and K. Tuyls, “Computing approximate equilibria in sequential adversarial games by exploitability descent,” in Proc. Int. Joint Conf. Artif. Intell., 2019, pp. 464–470.
[11] S. Srinivasan, M. Lanctot, V. Zambaldi, J. Pérolat, K. Tuyls, R. Munos, and M. Bowling, “Actor-critic policy optimization in partially observable multiagent environments,” in Proc. Adv. Neural Inf. Process. Syst., 2018, pp. 3422–3435.
[12] M. Lanctot, V. Zambaldi, A. Gruslys, A. Lazaridou, K. Tuyls, J. Perolat, D. Silver, and T. Graepel, “A unified game-theoretic approach to multiagent reinforcement learning,” in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 4191–4204.
[13] S. Fang and S. Puthenpura, Linear Optimization and Extensions: Theory and Algorithms. Englewood Cliffs, NJ, USA: Prentice Hall, 1993.