A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical/experimental research and development in all areas of automation.
Volume 11, Issue 5, May 2024

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
  • CiteScore: 17.6, Top 3% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: W. Ren, Y. Tang, Q. Sun, C. Zhao, and Q.-L. Han, “Visual semantic segmentation based on few/zero-shot learning: An overview,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 5, pp. 1106–1126, May 2024. doi: 10.1109/JAS.2023.123207

Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview

doi: 10.1109/JAS.2023.123207
Funds:  This work was supported by the National Key Research and Development Program of China (2021YFB1714300) and the National Natural Science Foundation of China (62233005), and in part by the CNPC Innovation Fund (2021D002-0902), the Fundamental Research Funds for the Central Universities, and Shanghai AI Lab. Qiyu Sun is sponsored by the Shanghai Gaofeng and Gaoyuan Project for University Academic Program Development.
  • Visual semantic segmentation aims to partition a visual sample into blocks with distinct semantic attributes and to identify the category of each block, and it plays a crucial role in environmental perception. Conventional learning-based visual semantic segmentation approaches rely heavily on large-scale training data with dense annotations and consistently fail to estimate accurate semantic labels for unseen categories. This limitation has spurred intense interest in studying visual semantic segmentation with the assistance of few/zero-shot learning. The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled, or even zero labeled, samples, which facilitates the extension to practical applications. Therefore, this paper focuses on recently published few/zero-shot visual semantic segmentation methods spanning 2D and 3D spaces and explores the commonalities and discrepancies among technical solutions under different segmentation circumstances. Specifically, the preliminaries of few/zero-shot visual semantic segmentation, including the problem definitions, typical datasets, and technical remedies, are briefly reviewed and discussed. Moreover, three typical instantiations are examined to uncover the interactions of few/zero-shot learning with visual semantic segmentation: image semantic segmentation, video object segmentation, and 3D segmentation. Finally, the open challenges of few/zero-shot visual semantic segmentation are discussed.
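
To make the few-shot setting described above concrete: the model receives a small "support" set of labeled examples of a novel category and must segment that category in an unlabeled "query" image. The following is a minimal sketch assuming a prototype-based strategy (masked average pooling plus cosine-similarity matching, in the spirit of prototype methods such as PANet) and pre-extracted backbone features; it is an illustration, not any surveyed author's exact implementation, and all names and shapes are placeholders.

```python
# Minimal sketch of prototype-based 1-way 1-shot segmentation.
# Assumptions: backbone features are already extracted; names/shapes illustrative.
import torch
import torch.nn.functional as F

def masked_average_pooling(feat, mask):
    """Average support features inside a binary mask.

    feat: (C, H, W) support feature map; mask: (H, W) binary mask.
    Returns a (C,) class prototype.
    """
    mask = mask.unsqueeze(0)                                  # (1, H, W)
    return (feat * mask).sum(dim=(1, 2)) / (mask.sum() + 1e-6)

def segment_query(query_feat, prototypes, scale=20.0):
    """Score each query pixel by cosine similarity to class prototypes.

    query_feat: (C, H, W); prototypes: (K, C). Returns (K, H, W) logits.
    """
    q = F.normalize(query_feat, dim=0)                        # unit-norm channels
    p = F.normalize(prototypes, dim=1)
    return scale * torch.einsum('kc,chw->khw', p, q)

# Toy usage: one support image/mask pair, one query image.
C, H, W = 64, 32, 32
support_feat = torch.randn(C, H, W)                           # assumed backbone output
support_mask = (torch.rand(H, W) > 0.5).float()               # novel-class mask
query_feat = torch.randn(C, H, W)

fg = masked_average_pooling(support_feat, support_mask)       # foreground prototype
bg = masked_average_pooling(support_feat, 1.0 - support_mask) # background prototype
logits = segment_query(query_feat, torch.stack([bg, fg]))     # (2, H, W)
pred_mask = logits.argmax(dim=0)                              # 1 = novel class
```

Episodic training repeats this support/query routine over many seen-class episodes so that the prototype matching learned during training transfers to unseen classes at test time.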

     




    Highlights

    • A comparison of the problem settings of different few/zero-shot visual segmentation tasks and a summary of technical solutions are provided
    • The leading-edge advances in few/zero-shot visual semantic segmentation are reviewed, and the differences among technical solutions in different segmentation tasks are specified (the zero-shot setting is sketched below)
    • The open challenges involving data, algorithms, and applications of few/zero-shot visual semantic segmentation are discussed to inform follow-up research
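
As referenced in the highlights above, zero-shot segmentation differs from the few-shot setting in that unseen categories arrive with no labeled masks at all; instead, auxiliary semantics such as word embeddings link seen and unseen classes. The sketch below loosely follows the semantic-projection idea (mapping pixel features into a word-embedding space and labeling each pixel by its nearest class embedding); the module, dimensions, and random tensors are illustrative assumptions, not a faithful reproduction of any specific surveyed method.

```python
# Minimal sketch of zero-shot segmentation via semantic projection.
# Assumptions: pre-extracted features and word vectors; all shapes illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticProjector(nn.Module):
    def __init__(self, feat_dim=256, embed_dim=300):
        super().__init__()
        # 1x1 conv that projects visual features into the word-embedding space.
        self.proj = nn.Conv2d(feat_dim, embed_dim, kernel_size=1)

    def forward(self, feat, class_embeds):
        """feat: (B, feat_dim, H, W); class_embeds: (K, embed_dim),
        e.g., word vectors of seen classes (training) or seen + unseen (test).
        Returns (B, K, H, W) per-pixel class logits."""
        z = F.normalize(self.proj(feat), dim=1)               # (B, E, H, W)
        e = F.normalize(class_embeds, dim=1)                  # (K, E)
        return torch.einsum('behw,ke->bkhw', z, e)

# Training uses only seen-class embeddings; at test time, unseen-class
# embeddings are appended so their pixels can be labeled without any masks.
model = SemanticProjector()
feat = torch.randn(2, 256, 16, 16)                            # assumed backbone output
seen = torch.randn(15, 300)                                   # seen-class word vectors
unseen = torch.randn(5, 300)                                  # unseen-class word vectors
logits = model(feat, torch.cat([seen, unseen]))               # (2, 20, 16, 16)
pred = logits.argmax(dim=1)                                   # per-pixel class ids
```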
