IEEE/CAA Journal of Automatica Sinica, vol. 12, no. 11

Citation: X. Wang, Y. Jin, H. Yu, Y. Cen, T. Wang, and Y. Li, “Point-MASNet: Masked autoencoder-based sampling network for 3D point cloud,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 11, pp. 2300–2313, Nov. 2025. doi: 10.1109/JAS.2024.125088
[1] L. Chen, X. Hu, W. Tian, H. Wang, D. Cao, and F. Wang, “Parallel planning: A new motion planning framework for autonomous driving,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 1, pp. 236–246, 2019. doi: 10.1109/JAS.2018.7511186
[2] H. Liu, C. Lin, B. Gong, and D. Wu, “Automatic lane-level intersection map generation using low-channel roadside LiDAR,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 5, pp. 1209–1222, 2023. doi: 10.1109/JAS.2023.123183
[3] Y. Liu, B. Tian, Y. Lv, L. Li, and F. Wang, “Point cloud classification using content-based transformer via clustering in feature space,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 1, pp. 231–239, 2024. doi: 10.1109/JAS.2023.123432
[4] W. Ren, Y. Tang, Q. Sun, C. Zhao, and Q. Han, “Visual semantic segmentation based on few/zero-shot learning: An overview,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 5, pp. 1106–1126, 2024. doi: 10.1109/JAS.2023.123207
[5] X. Wang, Y. Jin, Y. Cen, T. Wang, and Y. Li, “Attention models for point clouds in deep learning: A survey,” arXiv preprint arXiv:2102.10788, 2021.
[6] F. Yin, Z. Huang, T. Chen, G. Luo, G. Yu, and B. Fu, “DCNet: Large-scale point cloud semantic segmentation with discriminative and efficient feature aggregation,” IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 8, pp. 4083–4095, 2023. doi: 10.1109/TCSVT.2023.3239541
[7] Q. Zhang, L. Wang, H. Meng, W. Zhang, and G. Huang, “A LiDAR point clouds dataset of ships in a maritime environment,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 7, pp. 1681–1694, 2024. doi: 10.1109/JAS.2024.124275
[8] I. Lang, A. Manor, and S. Avidan, “SampleNet: Differentiable point cloud sampling,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 7575–7585.
[9] X. Wang, Y. Jin, Y. Cen, T. Wang, B. Tang, and Y. Li, “LighTN: Lightweight transformer network for performance-overhead tradeoff in point cloud downsampling,” IEEE Trans. Multim., pp. 1–16, 2023.
[10] C. Wu, J. Zheng, J. Pfrommer, and J. Beyerer, “Attention-based point cloud edge sampling,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 2023, pp. 5333–5343.
[11] C. Wen, B. Yu, and D. Tao, “Learnable skeleton-aware 3D point cloud sampling,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 2023, pp. 17671–17681.
[12] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, and R. B. Girshick, “Masked autoencoders are scalable vision learners,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, New Orleans, LA, USA, 2022, pp. 15979–15988.
[13] A. Xiao, J. Huang, D. Guan, X. Zhang, S. Lu, and L. Shao, “Unsupervised point cloud representation learning with deep neural networks: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 9, pp. 11321–11339, 2023. doi: 10.1109/TPAMI.2023.3262786
[14] J. Li, S. Guo, L. Wang, and S. Han, “HQ-Net: A heatmap-based query backbone for point cloud understanding,” Neurocomputing, vol. 606, p. 128413, 2024. doi: 10.1016/j.neucom.2024.128413
[15] Z. Zhang, H. Jiang, D. Shen, and S. S. Saab, “Data-driven learning control algorithms for unachievable tracking problems,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 1, pp. 205–218, 2024. doi: 10.1109/JAS.2023.123756
[16] R. Hassan, M. M. Fraz, A. Rajput, and M. Shahzad, “Residual learning with annularly convolutional neural networks for classification and segmentation of 3D point clouds,” Neurocomputing, vol. 526, pp. 96–108, 2023. doi: 10.1016/j.neucom.2023.01.026
[17] S. Ye, D. Chen, S. Han, Z. Wan, and J. Liao, “Meta-PU: An arbitrary-scale upsampling network for point cloud,” IEEE Trans. Vis. Comput. Graph., vol. 28, no. 9, pp. 3206–3218, 2022. doi: 10.1109/TVCG.2021.3058311
[18] J. Liu, J. Guo, and D. Xu, “APSNet: Toward adaptive point sampling for efficient 3D action recognition,” IEEE Trans. Image Process., vol. 31, pp. 5287–5302, 2022. doi: 10.1109/TIP.2022.3193290
[19] Y. Wu, X. Hu, Y. Zhang, M. Gong, W. Ma, and Q. Miao, “SACF-Net: Skip-attention based correspondence filtering network for point cloud registration,” IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 8, pp. 3585–3595, 2023. doi: 10.1109/TCSVT.2023.3237328
[20] Q. Hu, B. Yang, L. Xie, S. Rosa, Y. Guo, Z. Wang, N. Trigoni, and A. Markham, “Learning semantic segmentation of large-scale point clouds with random sampling,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 11, pp. 8338–8354, 2022.
[21] Z. Li, Z. Chen, A. Li, L. Fang, Q. Jiang, X. Liu, J. Jiang, B. Zhou, and H. Zhao, “SimIPU: Simple 2D image and 3D point cloud unsupervised pre-training for spatial-aware visual representations,” in Proc. 36th AAAI Conf. Artificial Intelligence, 2022, pp. 1500–1508.
[22] Y. Tian, L. Huang, H. Yu, X. Wu, X. Li, K. Wang, Z. Wang, and F. Wang, “Context-aware dynamic feature extraction for 3D object detection in point clouds,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 8, pp. 10773–10785, 2022. doi: 10.1109/TITS.2021.3095719
[23] O. Dovrat, I. Lang, and S. Avidan, “Learning to sample,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019, pp. 2760–2769.
[24] X. Wang, Y. Jin, Y. Cen, C. Lang, and Y. Li, “PST-NET: Point cloud sampling via point-based transformer,” in Proc. 11th Int. Conf. Image and Graphics, Haikou, China, 2021, pp. 57–69.
[25] H. Lee, J. Jeon, S. Hong, J. Kim, and J. Yoo, “TransNet: Transformer-based point cloud sampling network,” Sensors, vol. 23, no. 10, 2023.
[26] Y. Qian, J. Hou, Q. Zhang, Y. Zeng, S. Kwong, and Y. He, “Task-oriented compact representation of 3D point clouds via a matrix optimization-driven network,” IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 11, pp. 6981–6995, 2023. doi: 10.1109/TCSVT.2023.3270315
[27] Y. Lin, T. Liu, Y. Zhang, S. Liu, L. Ye, and H. Min, “LA-Net: LSTM and attention based point cloud down-sampling and its application,” Meas. Control, vol. 56, no. 7–8, p. 1261, 2023.
[28] Y. Pang, W. Wang, F. E. H. Tay, W. Liu, Y. Tian, and L. Yuan, “Masked autoencoders for point cloud self-supervised learning,” in Proc. European Conf. Computer Vision, Tel Aviv, Israel, vol. 13662, 2022, pp. 604–621.
[29] R. Zhang, Z. Guo, P. Gao, R. Fang, B. Zhao, D. Wang, Y. Qiao, and H. Li, “Point-M2AE: Multi-scale masked autoencoders for hierarchical point cloud pre-training,” in Proc. Advances in Neural Information Processing Systems, New Orleans, LA, USA, vol. 35, 2022, pp. 27061–27074.
[30] J. Jiang, X. Lu, L. Zhao, R. Dazeley, and M. Wang, “Masked autoencoders in 3D point cloud representation learning,” IEEE Trans. Multim., pp. 1–12, 2023.
[31] C. Min, L. Xiao, D. Zhao, Y. Nie, and B. Dai, “Occupancy-MAE: Self-supervised pre-training large-scale LiDAR point clouds with masked occupancy autoencoders,” IEEE Trans. Intell. Veh., vol. 9, no. 7, pp. 5150–5162, 2024. doi: 10.1109/TIV.2023.3322409
[32] G. Hess, J. Jaxing, E. Svensson, D. Hagerman, C. Petersson, and L. Svensson, “Masked autoencoder for self-supervised pre-training on LiDAR point clouds,” in Proc. IEEE/CVF Winter Conf. Applications of Computer Vision Workshops, Waikoloa, HI, USA, 2023, pp. 350–359.
[33] G. Krispel, D. Schinagl, C. Fruhwirth-Reisinger, H. Possegger, and H. Bischof, “MAELi: Masked autoencoder for large-scale LiDAR point clouds,” in Proc. IEEE/CVF Winter Conf. Applications of Computer Vision, Waikoloa, HI, USA, 2024, pp. 3371–3380.
[34] H. Li, Z. Wang, C. Lan, P. Wu, and N. Zeng, “A novel dynamic multiobjective optimization algorithm with hierarchical response system,” IEEE Trans. Comput. Soc. Syst., vol. 11, no. 2, pp. 2494–2512, 2024. doi: 10.1109/TCSS.2023.3293331
[35] W. Tan, H. Zhang, Z. Wang, H. Li, X. Gao, and N. Zeng, “S3TNet: A novel electroencephalogram signals-oriented emotion recognition model,” Comput. Biol. Med., vol. 179, p. 108808, 2024. doi: 10.1016/j.compbiomed.2024.108808
[36] L. Hu, Z. Wang, H. Li, P. Wu, J. Mao, and N. Zeng, “$\mathcal{L}$-DARTS: Lightweight differentiable architecture search with robustness enhancement strategy,” Knowl. Based Syst., vol. 288, p. 111466, 2024. doi: 10.1016/j.knosys.2024.111466
[37] M. Guo, J. Cai, Z. Liu, T. Mu, R. R. Martin, and S. Hu, “PCT: Point cloud transformer,” Comput. Vis. Media, vol. 7, no. 2, pp. 187–199, 2021. doi: 10.1007/s41095-021-0229-5
[38] Y. Ye, X. Yang, and S. Ji, “APSNet: Attention based point cloud sampling,” in Proc. 33rd British Machine Vision Conf., London, UK, 2022, p. 298.
[39] Y. Lin, K. Chen, S. Zhou, Y. Huang, and Y. Lei, “CO-NET: Classification-oriented point cloud sampling via informative feature learning and non-overlapped local adjustment,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, Rhodes Island, Greece, 2023, pp. 1–5.
[40] J. Xiong, T. Dai, Y. Zha, X. Wang, and S. Xia, “Semantic preserving learning for task-oriented point cloud downsampling,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, Rhodes Island, Greece, 2023, pp. 1–5.
[41] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “PointNet: Deep learning on point sets for 3D classification and segmentation,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017, pp. 77–85.
[42] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, “3D ShapeNets: A deep representation for volumetric shapes,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Boston, MA, USA, 2015, pp. 1912–1920.
[43] V. Sarode, X. Li, H. Goforth, Y. Aoki, R. A. Srivatsan, S. Lucey, and H. Choset, “PCRNet: Point cloud registration network using PointNet encoding,” arXiv preprint arXiv:1908.07906, 2019.
[44] W. Yuan, T. Khot, D. Held, C. Mertz, and M. Hebert, “PCN: Point completion network,” in Proc. Int. Conf. 3D Vision, Verona, Italy, 2018, pp. 728–737.
[45] A. X. Chang, T. A. Funkhouser, L. J. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su, J. Xiao, L. Yi, and F. Yu, “ShapeNet: An information-rich 3D model repository,” arXiv preprint arXiv:1512.03012, 2015.
[46] M. A. Uy, Q. Pham, B. Hua, D. T. Nguyen, and S. Yeung, “Revisiting point cloud classification: A new benchmark dataset and classification model on real-world data,” in Proc. IEEE/CVF Int. Conf. Computer Vision, Seoul, Republic of Korea, 2019, pp. 1588–1597.
[47] Y. Dai, C. Wen, H. Wu, Y. Guo, L. Chen, and C. Wang, “Indoor 3D human trajectory reconstruction using surveillance camera videos and point clouds,” IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 4, pp. 2482–2495, 2022. doi: 10.1109/TCSVT.2021.3081591
[48] Q. Han, “Editorial: Driving into future with reliable, secure, efficient and intelligent metavehicles,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 6, pp. 1355–1356, 2023. doi: 10.1109/JAS.2023.123621
[49] H. Li, Y. Guo, Z. Ren, F. R. Yu, J. You, and X. You, “Explicit local coupling global structure clustering,” IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 11, pp. 6649–6660, 2023. doi: 10.1109/TCSVT.2023.3266283
[50] S. Zang, H. Jin, Q. Yu, S. Zhang, and H. Yu, “Video summarization using U-shaped non-local network,” Int. J. Network Dynamics and Intelligence, p. 100013, 2024.
[51] X. Wang, Y. Jin, H. Yu, Y. Cen, and Y. Li, “Visual perception-inspired 3D point cloud sampling,” Pattern Recognit., vol. 169, p. 111883, 2026.