A journal of the IEEE and the Chinese Association of Automation (CAA), publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 9, Issue 9, Sept. 2022

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
  • CiteScore: 17.6, Top 3% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: W. Y. Ruan, H. B. Duan, and Y. M. Deng, “Autonomous maneuver decisions via transfer learning pigeon-inspired optimization for UCAVs in dogfight engagements,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 9, pp. 1639–1657, Sept. 2022. doi: 10.1109/JAS.2022.105803

Autonomous Maneuver Decisions via Transfer Learning Pigeon-Inspired Optimization for UCAVs in Dogfight Engagements

doi: 10.1109/JAS.2022.105803
Funds:  This work was partially supported by the Science and Technology Innovation 2030-Key Project of “New Generation Artificial Intelligence” (2018AAA0100803), the National Natural Science Foundation of China (U20B2071, 91948204, T2121003, U1913602)
  • This paper proposes an autonomous maneuver decision method using transfer learning pigeon-inspired optimization (TLPIO) for unmanned combat aerial vehicles (UCAVs) in dogfight engagements. First, a nonlinear F-16 aircraft model and an automatic control system are constructed on a MATLAB/Simulink platform. Second, a 3-degrees-of-freedom (3-DOF) aircraft model is used as a maneuvering command generator, and an expanded elemental maneuver library is designed so that the aircraft state reachable set can be obtained. Then, the game matrix is constructed from an air combat situation evaluation function computed from the angle and range threats. Finally, as a key point, the objective function to be optimized is designed using the game mixed strategy, and the optimal mixed strategy is obtained by TLPIO. Notably, the proposed TLPIO does not initialize the population randomly; instead, it adopts a transfer learning method based on Kullback-Leibler (KL) divergence to initialize the population, which improves the search accuracy of the optimization algorithm (an illustrative sketch of such an initialization follows below). In addition, the convergence and time complexity of TLPIO are discussed, and comparative analysis with other classical optimization algorithms highlights the advantage of TLPIO. For the air combat simulations, three initial scenarios are set: opposite, offensive, and defensive conditions. Simulation results verify the effectiveness of the proposed autonomous maneuver decision method.
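
The abstract states that TLPIO replaces random population initialization with a transfer learning step based on KL divergence but does not spell out the construction, so the following Python sketch is only an illustration of that idea under our own assumptions: elite solutions kept from a related source task (e.g., the previous decision step) act as the source knowledge, a small random probe of the current objective acts as the target sample, and transfer happens only when the two elite distributions are close in KL divergence. The helper names, the probe step, and the threshold are hypothetical, not taken from the paper.

    import numpy as np

    def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
        # KL( N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)) ) for diagonal Gaussians.
        return 0.5 * np.sum(
            np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
        )

    def fit_diag_gaussian(samples):
        # Mean and regularized per-dimension variance of a sample set.
        return samples.mean(axis=0), samples.var(axis=0) + 1e-12

    def transfer_init_population(source_elites, objective, lower, upper, pop_size,
                                 probe_size=50, elite_frac=0.2, kl_threshold=5.0,
                                 seed=None):
        # source_elites: (m, d) elites from a related source task (e.g., the previous
        # decision step); objective: current-step fitness to be maximized;
        # lower, upper: (d,) bounds of the current search box.
        rng = np.random.default_rng(seed)
        d = lower.size
        # Probe the current objective with a small uniform sample.
        probe = rng.uniform(lower, upper, size=(probe_size, d))
        scores = np.array([objective(x) for x in probe])
        n_elite = max(2, int(elite_frac * probe_size))
        probe_elites = probe[np.argsort(scores)[-n_elite:]]
        # Compare source and target elite distributions via KL divergence.
        mu_s, var_s = fit_diag_gaussian(source_elites)
        mu_t, var_t = fit_diag_gaussian(probe_elites)
        if kl_diag_gaussian(mu_s, var_s, mu_t, var_t) < kl_threshold:
            # Problems look similar: seed the pigeons around the source elites.
            pop = rng.normal(mu_s, np.sqrt(var_s), size=(pop_size, d))
        else:
            # Otherwise fall back to the random probe (no transfer).
            pop = probe[rng.integers(0, probe_size, size=pop_size)]
        return np.clip(pop, lower, upper)

In this reading, the KL threshold controls how aggressively knowledge is reused; the paper's actual mapping between source and target populations may differ.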

     

  • [1]
    Y. Q. Dong and J. L. Ai, “Decision making in autonomous air combat: A brief review and future prospects,” Acta Aeronautica et Astronautica Sinica, vol. 41, no. S2, pp. 2019–1985, Dec. 2020. doi: 10.2514/6.2019-1985
    [2]
    R. Isaacs, Differential Games: A Mathematical Theory With Applications to Warfare and Pursuit, Control and Optimization. Mineola, USA: Dover Publication INC., 1999.
    [3]
    B. Baris and K. Emre, “Differential flatness-based optimal air combat maneuver strategy generation,” in Proc. AIAA SciTech Forum, Jan. 2019, p. 1985.
    [4]
    F. Austin, G. Carbone, M. Falco, H. Hinz, and M. Lewis, “Game theory for automated maneuvering during air-to-air combat,” J. Guidance, vol. 13, no. 6, pp. 1143–1149, Nov.–Dec. 1990. doi: 10.2514/3.20590
    [5]
    B. Baris and K. Emre, “Aerial combat simulation environment for one-on-one engagement,” in Proc. AIAA Modeling and Simulation Technologies Conf., Jan. 2018, p. 432.
    [6]
    R. L. Nelson and Z. Rafal, “Effectiveness of autonomous decision making for unmanned combat aerial vehicles in dogfight engagements,” J. Guidance,Control,and Dynamics, vol. 41, no. 4, pp. 1015–1021, Apr. 2018. doi: 10.2514/1.G003088
    [7]
    K. Virtanen, T. Raivio, and R. P. Hamalainen, “Modeling pilot’s sequential maneuvering decision by a multistage influence diagram,” J. Guidance,Control,and Dynamics, vol. 27, no. 4, pp. 665–676, Jul.–Aug. 2004. doi: 10.2514/1.11167
    [8]
    J. S. McGrew, P. J. How, B. Williams, and N. Roy, “Air-combat strategy using approximate dynamic programming,” J. Guidance,Control,and Dynamics, vol. 33, no. 5, pp. 1641–1653, Sep.–Oct. 2010. doi: 10.2514/1.46815
    [9]
    K. Srivastava and A. Surana, “Monte Carlo tree search based tactical maneuvering,” arXiv preprint arXiv: 2009.08807, Sep. 2020.
    [10]
    J. R. Bertram and P. Wei, “An efficient algorithm for multiple-pursuer-multiple-evader pursuit/evasion game,” in Proc. AIAA SciTech Forum, Jan. 2021, p. 1862.
    [11]
    N, Ernest, K. Cohen, E. Kivelevitch, C. Schumacher, and D. Casbeer, “Genetic fuzzy trees and their application towards autonomous training and control of a squadron of unmanned combat aerial vehicles,” Unmanned Systems, vol. 3, no. 3, pp. 185–204, May 2015. doi: 10.1142/S2301385015500120
    [12]
    N. Ernest, D. Carrol, C. Schumacher, M. Clarkm, K. Cohen, and G. Lee, “Genetic fuzzy based artificial intelligence for unmanned combat aerial vehicle control in simulated air combat missions,” J. Defense Management, vol. 6, no. 1, p. 1000144, Mar. 2016.
    [13]
    O. Vinyals, I. Babuschkin, W. M. Czarnecki, M. Mathieu, A. Dudzik, J. Y. Chung, D. H. Choi, R. Powell, T. Ewalds, P. Georgiev, J. Oh, D. Horgan, M. Kroiss, I. Danihelka, A. Huang, L. Sifre, T. Cai, J. P. Agapiou, M. Jaderberg, A. S. Vezhnevets, R. Leblond, T. Pohlen, V. Dalibard, D. Budden, Y. Sulsky, J. Molloy, T. L. Paine, C. Gulcehre, Z. Y, Wa ng, T. Pfaff, Y. H. Wu, R. Ring, D. Yogatama, D. Wünsch, K. McKinney, O. Smith, T. Schaul, T. Lillicrap, K. Kavukcuoglu, D. Hassabis, C. Apps, and D. Silver, “Grandmaster level in StarCraft II using multi-agent reinforcement learning,” Nature, vol. 575, no. 14, pp. 350–359, Nov. 2019.
    [14]
    A. P. Pope, J. S. Ide, D. Micovic, H. Diaz, D. Rosenbluth, L. Ritholtz, J. C. Twedt, T. T. Walker, K. Alcedo, and D. Javorsek, “Hierarchical reinforcement learning for air-to-air combat,” in Proc. Int. Conf. Unmanned Aircraft Systems, Jun. 2021, pp. 275−284.
    [15]
    J. Xu, Q. Guo, L. Xiao, Z. Li, and G. Zhang, “Autonomous decision-making method for combat mission of UAV based on deep reinforcement learning,” in Proc. 4th IEEE Advanced Information Technology, Electronic and Automation Control Conf., Dec. 2019, vol. 1, pp. 538−544.
    [16]
    Q. Yang, J. Zhang, G. Shi, J. Hu, and Y. Wu, “Maneuver decision of UAV in short-range air combat based on deep reinforcement learning,” IEEE Access, vol. 8, pp. 363–378, Dec. 2019.
    [17]
    W. Kong, D. Zhou, Z. Yang, Y. Zhao, and K. Zhang, “UAV autonomous aerial combat maneuver strategy generation with observation error based on state-adversarial deep deterministic policy gradient and inverse reinforcement learning,” Electronics, vol. 9, no. 7, p. 1121, Jul. 2020.
    [18]
    Y. Chen, J. Zhang, Q. Yang, Y. Zhou, G. Shi, and Y. Wu, “Design and verification of UAV maneuver decision simulation system based on deep q-learning network,” in Proc. 16th Int. Conf. Control, Automation, Robotics and Vision, Dec. 2020, pp. 817–823.
    [19]
    Z. X. Sun, H. Y. Piao, Z. Yang, Y. Y. Zhao, G. Zhan, D. Y. Zhou, G. L. Meng, H. C. Chen, X. Chen, B. H. Qu and Y. J. Lu, “Multi-agent hierarchical policy gradient for air combat tactics emergence via self-play,” Engineering Applications Artificial Intelligence, vol. 98, p. 104112, Feb. 2021.
    [20]
    L. A. Zhang, J. Xu, D. Gold, J. Hagen, A. K. Kochhar, A. J. Lohn, and O. A. Osoba, “Air dominance through machine learning: A preliminary exploration of artificial intelligence-assisted mission planning,” RAND Corporation, Santa Monica, USA, Tech. Rep., May. 2020.
    [21]
    H. B. Duan and P. X. Qiao, “Pigeon-inspired optimization: A new swarm intelligence optimizer for air robot path planning,” Int. J. Intelligent Computing and Cybernetics, vol. 7, no. 1, pp. 24–37, Mar. 2014. doi: 10.1108/IJICC-02-2014-0005
    [22]
    H. B. Duan, J. X. Zhao, Y. M. Deng, Y. H. Shi, and X. L. Ding, “Dynamic discrete pigeon-inspired optimization for multi-UAV cooperative search-attack mission planning,” IEEE Trans. Aerospace and Electronic Systems, vol. 57, no. 1, pp. 706–720, Feb. 2021. doi: 10.1109/TAES.2020.3029624
    [23]
    H. B. Duan, M. Z. Huo, and S. Y. Hui, “Limit-cycle-based mutant multiobjective pigeon-inspired optimization,” IEEE Trans. Evolutionary Computation, vol. 24, no. 5, pp. 948–959, Oct. 2020. doi: 10.1109/TEVC.2020.2983311
    [24]
    H. B. Duan, M. Z. Huo, Z. Y. Yang, S. Y. Hui, and Q. N Luo, “Predator-prey pigeon-inspired optimization for UAV ALS longitudinal parameters tuning,” IEEE Trans. Aerospace and Electronic Systems, vol. 55, no. 5, pp. 2347–2358, Oct. 2019. doi: 10.1109/TAES.2018.2886612
    [25]
    Q. Xue and H. B. Duan, “Robust attitude control for reusable launch vehicles based on fractional calculus and pigeon-inspired optimization,” IEEE/CAA J. Autom. Sinica, vol. 4, no. 1, pp. 89–97, Jan. 2017. doi: 10.1109/JAS.2017.7510334
    [26]
    M. Jiang, Z. Q. Huang, L. M. Qiu, W. Z. Huang, and G. G. Yen, “Transfer learning-based dynamic multiobjective optimization algorithms,” IEEE Trans. Evolutionary Computation, vol. 22, no. 4, pp. 501–514, Aug. 2018. doi: 10.1109/TEVC.2017.2771451
    [27]
    L. Sonneveldt, Nonlinear F-16 Model Description. Delft University of Technology, Delft, Netherlands, Tech. Rep. Jun. 2006
    [28]
    S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Trans. Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, Oct. 2010. doi: 10.1109/TKDE.2009.191
    [29]
    B. Zhang, H. B. Duan, “Three-dimensional path planning for uninhabited combat aerial vehicle based on predator-prey pigeon-inspired optimization in dynamic environment,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 14, no. 1, pp. 97–107, Feb. 2017. doi: 10.1109/TCBB.2015.2443789
    [30]
    K. Yang, “Transfer learning based particle swarm optimization algorithms research and applications,” M.S. thesis, Department of Information and Control Engineering, China University of Mining and Technology, China, Jun. 2019.



    Highlights

    • A nonlinear F-16 aircraft model with aerodynamic data is used as the control object, and an automatic control system is constructed on a MATLAB/Simulink platform
    • An expanded elemental maneuver library is designed using a maneuvering command generator, and a converter that maps control commands from the 3-DOF aircraft model to the 6-DOF aircraft model is presented
    • The maneuver decision objective function is designed using the game mixed strategy, and the optimal mixed strategy is obtained by TLPIO; the convergence and time complexity of TLPIO are also discussed (see the sketch after this list)
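
The exact situation evaluation function and payoff construction are given in the paper itself; the Python sketch below only illustrates one common reading of the highlight above, namely that the game matrix scores every pair of own/opponent elemental maneuvers from angle and range threats, and that the quantity maximized over the probability simplex is the worst-case (maximin) expected payoff of the mixed strategy. The weights, optimal range, and function names are placeholders of ours.

    import numpy as np

    def situation_score(aspect_angle, antenna_train_angle, distance,
                        d_opt=1000.0, sigma=500.0):
        # Toy situation evaluation combining an angle threat term and a range
        # threat term, each angle assumed in [0, pi] radians so the score lies
        # in [0, 1]. d_opt, sigma, and the equal weights are illustrative
        # placeholders, not the paper's values.
        angle_term = 1.0 - (aspect_angle + antenna_train_angle) / (2.0 * np.pi)
        range_term = np.exp(-((distance - d_opt) ** 2) / (2.0 * sigma ** 2))
        return 0.5 * angle_term + 0.5 * range_term

    def mixed_strategy_value(p, game_matrix):
        # game_matrix[i, j]: situation score when we fly elemental maneuver i and
        # the opponent flies maneuver j. The strategy p is projected onto the
        # probability simplex and scored by its worst-case expected payoff.
        p = np.clip(np.asarray(p, dtype=float), 0.0, None)
        p = p / p.sum()
        expected = p @ game_matrix      # expected payoff against each opponent maneuver
        return expected.min()           # maximin (security level) of the mixed strategy

    # Example: value of the uniform mixed strategy on a random 7-maneuver game.
    A = np.random.default_rng(0).random((7, 7))
    print(mixed_strategy_value(np.ones(7) / 7, A))

An optimizer such as TLPIO would then treat mixed_strategy_value as the fitness to be maximized over candidate strategy vectors.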
