A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 11 Issue 2
Feb.  2024

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
N. Zeng, X. Li, P. Wu, H. Li, and  X. Luo,  “A novel tensor decomposition-based efficient detector for low-altitude aerial objects with knowledge distillation scheme,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 2, pp. 487–501, Feb. 2024. doi: 10.1109/JAS.2023.124029

A Novel Tensor Decomposition-Based Efficient Detector for Low-Altitude Aerial Objects With Knowledge Distillation Scheme

doi: 10.1109/JAS.2023.124029
Funds:  This work was supported in part by the National Natural Science Foundation of China (62073271), the Natural Science Foundation for Distinguished Young Scholars of the Fujian Province of China (2023J06010), and the Fundamental Research Funds for the Central Universities of China (20720220076)
  • Unmanned aerial vehicles (UAVs) have gained significant attention in practical applications; in particular, low-altitude aerial (LAA) object detection imposes stringent requirements on both recognition accuracy and computational resources.  In this paper, a tensor decomposition and knowledge distillation-based network (TDKD-Net) oriented to LAA images is proposed, where tensor-train (TT)-format tensor decomposition (TD) and equal-weighted response-based knowledge distillation (KD) methods are designed to minimize redundant parameters while ensuring comparable performance.  Moreover, robust network structures are developed, including a small-object detection head and a dual-domain attention mechanism, which enable the model to leverage knowledge learned from small-scale targets and to selectively focus on salient features.  Considering the imbalance of bounding-box regression samples and the inaccuracy of regression geometric factors, a focal and efficient IoU (intersection over union) loss with optimal transport assignment (F-EIoU-OTA) mechanism is proposed to improve detection accuracy.  The proposed TDKD-Net is comprehensively evaluated through extensive experiments, and the results demonstrate the effectiveness and superiority of the developed methods over other advanced detection algorithms, together with high generalization and strong robustness.  As a resource-efficient and precise network, TDKD-Net also handles the complex detection of small and occluded LAA objects well, providing useful insights into handling imbalanced issues and realizing domain adaptation.
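To make the compression step concrete: the abstract's TT-format tensor decomposition represents a large weight tensor as a chain ("train") of small 3-way cores. The sketch below is a minimal NumPy illustration of Oseledets' classic TT-SVD applied to a toy 4-way tensor, not the paper's implementation; the function names and the example shapes are assumptions for illustration only.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Factor a d-way tensor into TT cores of shape (r_{k-1}, n_k, r_k)
    via sequential truncated SVD (Oseledets' TT-SVD)."""
    shape = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.copy()
    for k in range(len(shape) - 1):
        mat = mat.reshape(r_prev * shape[k], -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, s.size)               # truncate to the TT rank
        cores.append(u[:, :r].reshape(r_prev, shape[k], r))
        mat = s[:r, None] * vt[:r]              # carry the remainder forward
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])

# A reshaped 16x16 weight matrix: 256 parameters dense.
w = np.random.RandomState(0).rand(4, 4, 4, 4)
cores = tt_svd(w, max_rank=16)                  # full rank: exact recovery
assert np.allclose(tt_reconstruct(cores), w)
n_tt = sum(c.size for c in tt_svd(w, max_rank=2))  # 48 parameters at rank 2
```

At TT rank 2 the same tensor is stored in 48 parameters instead of 256, which is the kind of redundancy reduction the paper exploits on convolution operators (at some approximation cost that the distillation step then compensates for).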
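The "equal-weighted response-based KD" in the abstract follows the general Hinton-style pattern: the student is trained against a blend of hard ground-truth labels and the teacher's temperature-softened outputs, with the two terms weighted equally. The NumPy sketch below illustrates that pattern only; the function names, temperature, and weighting constant are assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(z, temp=1.0):
    z = z / temp
    z = z - z.max(axis=-1, keepdims=True)       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def response_kd_loss(student_logits, teacher_logits, labels, temp=2.0, alpha=0.5):
    """Response-based distillation: cross-entropy on hard labels blended with
    cross-entropy against the teacher's softened distribution.
    alpha = 0.5 weights the two terms equally."""
    n = len(labels)
    # Soft term: match the teacher's temperature-softened responses
    # (scaled by temp**2 so gradients keep a comparable magnitude).
    p_teacher = softmax(teacher_logits, temp)
    log_p_student = np.log(softmax(student_logits, temp) + 1e-12)
    soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * temp ** 2
    # Hard term: usual cross-entropy on the ground-truth labels.
    log_p = np.log(softmax(student_logits) + 1e-12)
    hard = -log_p[np.arange(n), labels].mean()
    return alpha * hard + (1.0 - alpha) * soft
```

A student whose responses track the teacher's incurs a smaller soft term than one that contradicts it, which is how the compact (TT-compressed) student recovers accuracy lost to compression.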

  • [1]
    M. L. Liu, Z. D. Wang, H. Li, P. S. Wu, F. E. Alsaadi, and N. Y. Zeng, “AA-WGAN: Attention augmented wasserstein generative adversarial network with application to fundus retinal vessel segmentation,” Comput. Biol. Med., vol. 158, p. 106874, 2023. doi: 10.1016/j.compbiomed.2023.106874
    [2]
    P. S. Wu, Z. D. Wang, B. X. Zheng, H. Li, F. E. Alsaadi, and N. Y. Zeng, “AGGN: Attention-based glioma grading network with multi-scale feature extraction and multi-modal information fusion,” Comput. Biol. Med., vol. 152, p. 106457, 2022.
    [3]
    S. M. Adams and C. J. Friedland, “A survey of unmanned aerial vehicle (UAV) usage for imagery collection in disaster research and management,” in Proc. 9th Int. Conf. Remote Sensing for Disaster Response, 2011, pp. 1–8.
    [4]
    E. I. Vlahogianni, J. D. Ser, K. Kepaptsoglou, and I. Laña, “Model free identification of traffic conditions using unmanned aerial vehicles and deep learning,” J. Big Data Anal. Transp., vol. 3, pp. 1–13, 2021.
    [5]
    N. Audebert, S. B. Le, and S. Lefevre, “Segment-before-detect: Vehicle detection and classification through semantic segmentation of aerial images,” Remote Sens., vol. 9, no. 4, p. 368, 2017. doi: 10.3390/rs9040368
    [6]
    S. Q. Ren, K. M. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, 2016.
    [7]
    J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proc. IEEE 29th Conf. Computer Vision and Pattern Recognition, 2016, pp. 779–788.
    [8]
    J. Redmon and A. Farhadi, “YOLO9000: Better, faster, stronger, “ in Proc. IEEE 30th Conf. Computer Vision and Pattern Recognition, 2017, pp. 6517–6525.
    [9]
    J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv: 1804.02767, 2018.
    [10]
    S. F. Zhang, L. Y. Wen, X. Bian, Z. L. Lei, and S. Z. Li, “Single-shot refinement neural network for object detection,” in Proc. IEEE/CVF 31st Conf. Computer Vision and Pattern Recognition, 2018, pp. 4203–4212.
    [11]
    S. H. Liu, J. L. Zha, J. Sun, Z. Li, and G. Wang, “EdgeYOLO: An edge-real-time object detector,” arXiv preprint arXiv: 2302.07483, 2023.
    [12]
    P. Y. Chen, M. C. Chang, J. W. Hsieh, and Y. S. Chen, “Parallel residual bi-fusion feature pyramid network for accurate single-shot object detection,” IEEE Trans. Image Process., vol. 30, no. 3118953, pp. 9099–9111, 2021.
    [13]
    X. Li, P. J. Ye, J. J. Li, Z. M. Liu, L. B. Cao, and F.-Y. Wang, “From features engineering to scenarios engineering for trustworthy AI: I&I, C&C, and V&V,” IEEE Intell. Syst., vol. 37, no. 4, pp. 18–26, 2022. doi: 10.1109/MIS.2022.3197950
    [14]
    X. Li, Y. L. Tian, P. J. Ye, H. B. Duan, and F.-Y. Wang, “A novel scenarios engineering methodology for foundation models in metaverse,” IEEE Trans. Syst. Man Cybern. Syst., vol. 53, no. 4, pp. 2148–2159, 2022.
    [15]
    X. Li, K. F. Wang, X. F. Gu, F. Deng, and F.-Y. Wang, “ParallelEye pipeline: An effective method to synthesize images for improving the visual intelligence of intelligent vehicles,” IEEE Trans. Syst. Man Cybern. Syst., vol. 53, no. 9, pp. 5545–5556, 2023. doi: 10.1109/TSMC.2023.3273896
    [16]
    C. Y. Wang, A. Bochkovskiy, and H. Y. Liao, “YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors,” in Proc. IEEE/CVF 36th Conf. Computer Vision and Pattern Recognition, 2023, pp. 7464–7475.
    [17]
    M. L. Li, X. K. Zhao, J. S. Li, and L. L. Nan, “ComNet: Combinational neural network for object detection in UAV-borne thermal images,” IEEE Trans. Geosci. Remote Sens., vol. 59, no. 8, pp. 6662–6673, 2021. doi: 10.1109/TGRS.2020.3029945
    [18]
    T. Ye, W. Y. Qin, Z. Y. Zhao, X. Z. Gao, X. P. Deng, and Y. Ouyang, “Real-time object detection network in UAV-vision based on CNN and transformer,” IEEE Trans. Instrum. Meas., vol. 72, no. 2505713, pp. 1–13, 2023.
    [19]
    X. K. Zhu, S. C. Lyu, X. Wang, and Q. Zhao, “TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios,” in Proc. IEEE/CVF 18th Int. Conf. Computer Vision, 2021, pp. 2778–2788.
    [20]
    P. Mittal, A. Sharma, R. Singh, and V. Dhull, “Dilated convolution based RCNN using feature fusion for low-altitude aerial objects,” Expert Syst. Appl., vol. 199, p. 117106, 2022. doi: 10.1016/j.eswa.2022.117106
    [21]
    W. Sun, L. Dai, X. R. Zhang, P. S. Chang, and X. Z. He, “RSOD: Real-time small object detection algorithm in UAV-based traffic monitoring,” Appl. Intell., vol. 52, no. 8, pp. 8448–8463, 2021.
    [22]
    C. Y. Wang, H. M. Liao, and I. Yeh, “Designing network design strategies through gradient path analysis,” arXiv preprint arXiv: 2211.04800, 2022.
    [23]
    X. H. Ding, X. Y. Zhang, N. N. Ma, J. G. Han, G. G. Ding, and J. Sun, “RepVGG: Making VGG-style ConvNets great again,” in Proc. IEEE/CVF 34th Conf. Computer Vision and Pattern Recognition, 2021, pp. 13728–13737.
    [24]
    N. Y. Zeng, P. S. Wu, Z. D. Wang, H. Li, W. B. Liu, and X. H. Liu, “A small-sized object detection oriented multi-scale feature fusion approach with application to defect detection,” IEEE Trans. Instrum. Meas., vol. 71, p. 3507014, 2022.
    [25]
    W. Liu, M. L. Wang, S. C. Zhang, and P. Zhou, “Research on vehicle target detection technology based on UAV aerial images,” in Proc. IEEE 19th Int. Conf. Mechatronics and Autom., 2022, pp. 412–416.
    [26]
    F. M. Deng, Z. X. Xie, W. Mao, B. Li, Y. Shan, B. Q. Wei, and H. Zeng, “Research on edge intelligent recognition method oriented to transmission line insulator fault detection,” Int. J. Electr. Power Energy Syst., vol. 139, p. 108054, 2022. doi: 10.1016/j.ijepes.2022.108054
    [27]
    B. Wu, Z. J. Xue, and W. J. Chen, “Model compression based on YOLOv3 object detection algorithm from the perspective of UAV,” in Proc. 40th Conf. Chinese Control, 2021, pp. 8439–8444.
    [28]
    S. Y. Wang, J. Zhao, N. Ta, X. Y. Zhao, M. X. Xiao, and H. C. Wei, “A real-time deep learning forest fire monitoring algorithm based on an improved pruned plus KD model,” J. Real Time Image Process., vol. 18, no. 6, pp. 2319–2329, 2021. doi: 10.1007/s11554-021-01124-9
    [29]
    I. V. Oseledets, “Tensor-train decomposition,” SIAM J. Sci. Comput., vol. 33, no. 5, pp. 2295–2317, 2011. doi: 10.1137/090752286
    [30]
    A. Novikov, D. Podoprikhin, A. Osokin, and D. Vetrov, “Tensorizing neural networks,” in Proc. 29th Conf. Neural Information Processing Systems, 2015, pp. 1–9.
    [31]
    Y. Pan, M. L. Wang, and Z. L. Xu, “TedNet: A pytorch toolkit for tensor decomposition networks,” Neurocomputing, vol. 469, no. 10064, pp. 234–238, 2022.
    [32]
    T. Garipov, D. Podoprikhin, A. Novikov, and D. Vetrov, “Ultimate tensorization: Compressing convolutional and FC layers alike,” arXiv preprint arXiv: 1611.03214, 2016.
    [33]
    X. Luo, H. Wu, Z. Wang, J. J. Wang, and D. Y. Meng, “A novel approach to large-scale dynamically weighted directed network representation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 12, pp. 9756–9773, 2022. doi: 10.1109/TPAMI.2021.3132503
    [34]
    Z. G. Liu, G. X. Yuan, and X. Luo, “Symmetry and nonnegativity-constrained matrix factorization for community detection,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 9, pp. 1691–1693, 2022. doi: 10.1109/JAS.2022.105794
    [35]
    Z. G. Liu, Y. G. Yi, and X. Luo, “A high-order proximity-incorporated nonnegative matrix factorization-based community detector,” IEEE Trans. Emerg. Topics Comput., vol. 7, no. 3, pp. 700–714, 2023. doi: 10.1109/TETCI.2022.3230930
    [36]
    J. P. Gou, B. S. Yu, S. J. Maybank, and D. C. Tao, “Knowledge distillation: A survey,” Int. J. Comput. Vis., vol. 129, no. 6, pp. 1789–1819, 2021. doi: 10.1007/s11263-021-01453-z
    [37]
    G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv: 1503.02531, 2015.
    [38]
    Z. D. Yang, Z. Li, X. H. Jiang, Y. Gong, Z. H. Yuan, D. P. Zhao, and C. Yuan, “Focal and global knowledge distillation for detectors,” in Proc. IEEE/CVF 35th Conf. Computer Vision and Pattern Recognition, 2022, pp. 4633–464.
    [39]
    X. Dai, Z. R. Jiang, Z. Wu, Y. P. Bao, Z. C. Wang, S. Liu, and E. J. Zhou, “General instance distillation for object detection,” in Proc. IEEE/CVF 34th Conf. Computer Vision and Pattern Recognition, 2021, pp. 7838–7847.
    [40]
    S. Woo, J. Park, J. Lee, and I. S. Kweon, “CBAM: Convolutional block attention module,” in Proc. 15th European Conf. Computer Vision, 2018, pp. 3–19.
    [41]
    M. S. Liu, S. Y. Luo, K. Han, B. Yuan, R. F. DeMara, and Y. Bai, “An efficient real-time object detection framework on resource-constricted hardware devices via software and hardware co-design,” in Proc. IEEE 32nd Int. Conf. Application-Specific Systems, Architectures and Processors, 2021, pp. 77–84.
    [42]
    Z. Ge, S. T. Liu, Z. M. Li, Y. Osamu, and J. Sun, “Ota: Optimal transport assignment for object detection” in Proc. IEEE/CVF 34th Conf. Computer Vision and Pattern Recognition, 2021, pp. 303–312.
    [43]
    Y. F. Zhang, W. Q. Ren, Z. Zhang, Z. Jia, L. Wang, and T. Tan, “Focal and efficient IOU loss for accurate bounding box regression,” Neurocomputing, vol. 506, no. 07042, pp. 146–157, 2022.
    [44]
    R. Mehta and C. Ozturk, “Object detection at 200 frames per second,” in Proc. 15th European Conf. Computer Vision, 2018, pp. 659–675.
    [45]
    P. F. Zhu, L. Y. Wen, D. W. Du, X. Bian, H. Fan, Q. H. Hu, and H. B. Ling, “Detection and tracking meet drones challenge,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 11, pp. 7380–7399, 2022. doi: 10.1109/TPAMI.2021.3119563
    [46]
    L. A. Varga, B. Kiefer, M. Messmer, and A. Zell, “SeaDronesSee: A maritime benchmark for detecting humans in open water,” in Proc. IEEE/CVF 22th Winter Conf. Applications of Computer Vision, 2022, pp. 3686–3696.
    [47]
    W. Han, J. Li, S. Wang, Y. Wang, J. N. Yan, R. Y. Fan, X. H. Zhang, and L. Z. Wang, “A context-scale-aware detector and a new benchmark for remote sensing small weak object detection in unmanned aerial vehicle images,” Int. J. Appl. Earth Obs. Geoinformation, vol. 112, p. 102966, 2022. doi: 10.1016/j.jag.2022.102966
    [48]
    T. Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D. Ramanan, C. L. Zitnick, and P. Dollár, “Microsoft COCO: Common objects in context,” arXiv preprint arXiv: 1405.0312, 2015.
    [49]
    T. Lin, P. Goyal, R. Girshick, K. M. He, and P. Dollar, “Focal loss for dense object detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 318–327, 2018.
    [50]
    Z. Tian, C. H. Shen, H. Chen, and T. He, “FCOS: Fully convolutional one-stage object detection,” in Proc. IEEE/CVF 17th Int. Conf. Computer Vision, 2019, pp. 9626–9635.
    [51]
    X. Y. Zhou, D. Q. Wang, and P. Krähenbühl, “Objects as points,” arXiv preprint arXiv: 1904.07850, 2019.
    [52]
    Y. H. Li, Y. T. Chen, N. Y. Wang, and Z. X. Zhang, “Scale-aware trident networks for object detection,” in Proc. IEEE/CVF 17th Int. Conf. Computer Vision, 2019, pp. 6053–6062.
    [53]
    S. f. Zhang, C. Chi, Y. Q. Yao, Z. Lei, and S. Z. Li, “Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection,” in Proc. IEEE/CVF 33rd Conf. Computer Vision and Pattern Recognition, 2020, pp. 9756–9765.
    [54]
    C. C. Zhu, Y. H. He, and M. Savvides, “Feature selective anchor-free module for single-shot object detection,” in Proc. IEEE/CVF 32nd Conf. Computer Vision and Pattern Recognition, 2019, pp. 840–849.
    [55]
    H. Y. Zhang, Y. Wang, F. Dayoub, and N. Sunderhauf, “VarifocalNet: An IoU-aware dense object detector,” in Proc. IEEE/CVF 34th Conf. Computer Vision and Pattern Recognition, 2021, pp. 8510–8519.
    [56]
    Z. H. Chen, C. H. Yang, J. H. Chang, F. Zhao, Z. J. Zha, and F. Wu, “DDOD: Dive deeper into the disentanglement of object detector,” IEEE Trans. Multimedia, no. 3264008, pp. 1–15, 2023.
    [57]
    Z. Ge, S. T. Liu, F. Wang, Z. M. Li, and J. Sun, “YOLOX: Exceeding YOLO series in 2021,” arXiv preprint arXiv: 2107.08430, 2021.
    [58]
    Z. W. Cai and N. Vasconcelos, “Cascade R-CNN: Delving into high quality object detection,” in Proc. IEEE/CVF 31st Conf. Computer Vision and Pattern Recognition, 2018, pp. 6154–6162.
    [59]
    C. J. Feng, Y. J. Zhong, Y. Gao, M. R. Scott, and W. L. Huang, “TOOD: Task-aligned one-stage object detection,” in Proc. IEEE/CVF 18th Int. Conf. Computer Vision, 2021, pp. 3490–3499.
    [60]
    J. Glenn, C. Ayush, and Q. Jing. YOLO by Ultralytics (Version 8.0.0) [Computer software]. https://github.com/ultralytics/ultralytics
    [61]
    S. L. Xu, X. X. Wang, W. Y. Lv, Q. Y. Chang, C. Cui, K. P. Deng, G. Z. Wang, Q. Q. Dang, S. Y. Wei, Y. N. Du, and B. H. Lai, “PP-YOLOE: An evolved version of YOLO,” arXiv preprint arXiv: 2203.16250, 2022.
    [62]
    H. Y. Zhao, H. P. Zhang, and Y. Y. Zhao, “Yolov7-sea: Object detection of maritime uav images based on improved Yolov7,” in Proc. IEEE/CVF 23rd Winter Conf. Applications of Computer Vision Workshops, 2023, pp. 233–238.
    [63]
    X. Li, H. B. Duan, Y. L. Tian, and F.-Y. Wang, “Exploring image generation for UAV change detection,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 6, pp. 1061–1072, 2022. doi: 10.1109/JAS.2022.105629
    [64]
    Y. L. Tian, X. Li, K. Wang, and F.-Y. Wang, “Training and testing object detectors with virtual images,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 6, pp. 539–546, 2018.

Figures (8) / Tables (7)

    Article Metrics

Article views: 528   PDF downloads: 92

    Highlights

    • A novel low-altitude aerial target detection framework for efficient computation
    • Multi-domain attention mechanisms contribute to key and robust feature extraction
    • Tensor decomposition can optimize convolution operators to reduce model redundancy
    • Knowledge distillation can compensate for precision loss during model compression
    • An effective loss function that accounts for multiple geometric factors
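The last highlight refers to the F-EIoU component of the loss, which penalizes not only the IoU but also the center distance and the width/height differences, each normalized by the smallest enclosing box. The sketch below follows Zhang et al.'s published EIoU formulation with a focal IoU^gamma reweighting; the corner-coordinate box format and the gamma value are assumptions for illustration, and the paper's full F-EIoU-OTA mechanism additionally involves the optimal transport assignment, which is not shown here.

```python
def eiou_loss(box_p, box_g, gamma=0.5, eps=1e-9):
    """Focal-EIoU loss for one predicted/ground-truth box pair,
    boxes given as (x1, y1, x2, y2)."""
    # Intersection area and IoU.
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    union = wp * hp + wg * hg - inter
    iou = inter / (union + eps)
    # Smallest enclosing box.
    cw = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    ch = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    # Squared center distance over the squared enclosing diagonal.
    dx = (box_p[0] + box_p[2] - box_g[0] - box_g[2]) / 2.0
    dy = (box_p[1] + box_p[3] - box_g[1] - box_g[3]) / 2.0
    dist = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    # Width/height penalties, each normalized by the enclosing box.
    asp = (wp - wg) ** 2 / (cw * cw + eps) + (hp - hg) ** 2 / (ch * ch + eps)
    eiou = 1.0 - iou + dist + asp
    return (iou ** gamma) * eiou            # focal reweighting by IoU
```

Perfectly overlapping boxes give (near-)zero loss, while misaligned boxes are penalized through all three geometric terms, which is what the highlight means by considering multiple geometric factors.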
