IEEE/CAA Journal of Automatica Sinica
Citation: N. Chen, L. Li, and W. Mao, “Equilibrium strategy of the pursuit-evasion game in three-dimensional space,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 2, pp. 446–458, Feb. 2024. doi: 10.1109/JAS.2023.123996
The pursuit-evasion game models the strategic interaction among players and has attracted attention in many realistic scenarios, such as missile guidance, unmanned aerial vehicles, and target defense. Existing studies mainly concentrate on the cooperative pursuit of multiple players in two-dimensional pursuit-evasion games. However, these approaches can hardly be applied to practical situations, where players usually move in three-dimensional space with three-degree-of-freedom control. In this paper, we make the first attempt to investigate the equilibrium strategy of the realistic pursuit-evasion game, in which the pursuer follows a three-degree-of-freedom control and the evader moves freely. First, we describe the pursuer’s three-degree-of-freedom control and the evader’s relative coordinate. We then rigorously derive the equilibrium strategy by solving the retrogressive path equation according to the Hamilton-Jacobi-Bellman-Isaacs (HJBI) method, which divides the pursuit-evasion process into the navigation and acceleration phases. In addition, we analyze the maximum allowable speed for the pursuer to capture the evader successfully and provide the strategy with which the evader can escape when the pursuer’s speed exceeds the threshold. We further conduct comparison tests with various unilateral deviations to verify that the proposed strategy forms a Nash equilibrium.