A journal of the IEEE and the CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 12 Issue 11
Nov. 2025

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 19.2, Top 1 (SCI Q1)
  • CiteScore: 28.2, Top 1% (Q1)
  • Google Scholar h5-index: 95, Top 5
X. Wang, Y. Jin, H. Yu, Y. Cen, T. Wang, and Y. Li, “Point-MASNet: Masked autoencoder-based sampling network for 3D point cloud,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 11, pp. 2300–2313, Nov. 2025. doi: 10.1109/JAS.2024.125088

Point-MASNet: Masked Autoencoder-Based Sampling Network for 3D Point Cloud

doi: 10.1109/JAS.2024.125088
Funds:  This work was supported by the National Key Research and Development Program of China (2022YFB3103500), the National Natural Science Foundation of China (62473033, 62571027), the Beijing Natural Science Foundation (L231012), and the State Scholarship Fund of the China Scholarship Council.
  • Task-oriented point cloud sampling aims to select a representative subset of the input tailored to a specific application scenario and task. However, existing approaches rarely address the redundancy caused by local structural similarities in 3D objects, which limits sampling performance. To address this issue, this paper introduces Point-MASNet, a novel task-oriented point cloud sampling network inspired by the masked autoencoder mechanism. Point-MASNet employs a voxel-based random non-overlapping masking strategy that lets the model selectively learn distinctive local structural features from the input, effectively mitigating redundancy and improving the representativeness of the sampled subset. In addition, we propose a lightweight, symmetrically structured keypoint reconstruction network designed as an autoencoder, which efficiently extracts latent features while enabling refined reconstructions. Extensive experiments demonstrate that Point-MASNet achieves competitive sampling performance across classification, registration, and reconstruction tasks.
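A minimal sketch of the voxel-based random non-overlapping masking idea described in the abstract, assuming a plain NumPy pipeline. The function name, voxel size, and mask ratio below are illustrative choices, not the authors' implementation: the cloud is partitioned into disjoint voxels, a random fraction of voxels is dropped, and only points in the surviving voxels are kept as encoder input.

```python
import numpy as np

def voxel_random_mask(points, voxel_size=0.2, mask_ratio=0.5, seed=None):
    """Partition a point cloud into non-overlapping voxels and randomly
    mask (drop) a fraction of them. Returns a boolean array of shape (N,)
    that is True for points lying in the visible (unmasked) voxels."""
    rng = np.random.default_rng(seed)
    # Assign each point to a voxel via integer grid coordinates.
    voxel_ids = np.floor(points / voxel_size).astype(np.int64)      # (N, 3)
    # Map identical voxel coordinates to a shared voxel index.
    _, inverse = np.unique(voxel_ids, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_voxels = int(inverse.max()) + 1
    # Voxels are disjoint by construction, so masked regions never overlap.
    masked = rng.choice(n_voxels, size=int(mask_ratio * n_voxels),
                        replace=False)
    return ~np.isin(inverse, masked)

# Usage: split a cloud into visible input and masked reconstruction targets.
pts = np.random.rand(1024, 3).astype(np.float32)
vis = voxel_random_mask(pts, voxel_size=0.25, mask_ratio=0.6, seed=0)
visible_pts, masked_pts = pts[vis], pts[~vis]
```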
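In the same spirit, a hedged PyTorch sketch of what a lightweight, symmetrically structured autoencoder for keypoint reconstruction could look like. The class name, layer widths, and PointNet-style max pooling are assumptions for illustration; the paper's actual architecture may differ.

```python
import torch
import torch.nn as nn

class TinyPointAutoencoder(nn.Module):
    """Symmetric encoder-decoder: a shared per-point MLP is max-pooled into
    a global latent code, and a mirrored MLP regresses n_out xyz points."""

    def __init__(self, n_out=128, latent=256):
        super().__init__()
        self.n_out = n_out
        self.encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, latent),
        )
        # Decoder mirrors the encoder widths, keeping the network lightweight.
        self.decoder = nn.Sequential(
            nn.Linear(latent, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_out * 3),
        )

    def forward(self, x):
        # x: (B, N, 3) visible points; max over N is a permutation-invariant
        # aggregation of per-point features into one global code.
        z = self.encoder(x).max(dim=1).values            # (B, latent)
        return self.decoder(z).view(-1, self.n_out, 3)   # (B, n_out, 3)

# Usage: reconstruct a coarse keypoint set from the visible (unmasked) subset.
model = TinyPointAutoencoder()
recon = model(torch.randn(4, 512, 3))  # -> (4, 128, 3)
```

Trained against the full cloud (e.g., with a Chamfer distance loss), such a network would push the reconstructed keypoints toward distinctive structure, matching the abstract's intuition that masking mitigates redundancy.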

     

