A journal of the IEEE and the CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: L. Xu, J. Liu, X. Chang, X. Liu, and C. Sun, “Hazard-aware weighted advantage combination for UAV target tracking and obstacle avoidance,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 6, pp. 1–12, Jun. 2025. doi: 10.1109/JAS.2024.124920

Hazard-Aware Weighted Advantage Combination for UAV Target Tracking and Obstacle Avoidance

doi: 10.1109/JAS.2024.124920
Funds:  This work was supported by the National Natural Science Foundation of China (62236002, 61921004)
Abstract: In recent years, the rapid evolution of unmanned aerial vehicles (UAVs) has brought transformative changes across many industries. Nevertheless, fundamental challenges in UAV technology, particularly target tracking and obstacle avoidance, remain crucial for applications such as wildlife protection and military security. Many existing reinforcement-learning methods for UAV multi-task problems must be redesigned and retrained for each new scenario and therefore cannot be extended quickly or effectively. To this end, we propose a novel solution based on a hazard-aware weighted advantage combination for UAV target tracking and obstacle avoidance. First, we train the target-tracking and obstacle-avoidance policies independently with the dueling double deep Q-network reinforcement learning algorithm. Then, in the multi-task scenario, we reuse the two pre-trained networks and design a weight determined by the present hazard level encountered by the UAV. This weight is used to form a weighted sum of the advantage values from both networks, from which the final action is obtained without any retraining. We validate our approach through extensive simulation experiments in the CoppeliaSim robotics simulator. The results demonstrate that our method outperforms current state-of-the-art techniques, achieving superior performance in both tracking accuracy and collision avoidance.
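
To make the combination rule concrete, the sketch below illustrates one plausible way to fuse two pre-trained dueling DDQN heads at decision time. It is a minimal sketch assuming a PyTorch implementation; the network class, the distance-based hazard weight, and all names (DuelingDQN, hazard_weight, d_safe, select_action) are illustrative assumptions rather than the authors' actual code, and the paper's weight design may differ.

```python
# Minimal sketch (PyTorch assumed): combining two pre-trained dueling DDQN
# networks with a hazard-dependent weight. Class and function names and the
# weight rule are illustrative assumptions, not the paper's exact design.
import torch
import torch.nn as nn


class DuelingDQN(nn.Module):
    """Dueling architecture: shared trunk, separate value and advantage heads."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)
        self.advantage = nn.Linear(hidden, n_actions)

    def advantages(self, state: torch.Tensor) -> torch.Tensor:
        h = self.trunk(state)
        adv = self.advantage(h)
        # Center the advantages, as in the standard dueling decomposition.
        return adv - adv.mean(dim=-1, keepdim=True)


def hazard_weight(obstacle_dist: float, d_safe: float = 2.0) -> float:
    """Hypothetical hazard level in [0, 1]: 1 when an obstacle touches the UAV,
    0 once the nearest obstacle is farther than d_safe (meters)."""
    return float(max(0.0, min(1.0, 1.0 - obstacle_dist / d_safe)))


@torch.no_grad()
def select_action(track_net: DuelingDQN, avoid_net: DuelingDQN,
                  state: torch.Tensor, obstacle_dist: float) -> int:
    """Hazard-aware weighted advantage combination: no retraining, just a
    weighted sum of the two networks' advantage values."""
    w = hazard_weight(obstacle_dist)
    combined = (1.0 - w) * track_net.advantages(state) \
        + w * avoid_net.advantages(state)
    return int(combined.argmax(dim=-1).item())
```

At run time, the weight w rises as the UAV approaches an obstacle, so the avoidance network's advantages dominate the action choice; far from obstacles, the tracking network drives the decision, which matches the behavior described in the abstract.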

     

