Domain-Invariant Similarity Activation Map Contrastive Learning for Retrieval-Based Long-Term Visual Localization

Hanjiang Hu; Hesheng Wang; Zhe Liu; Weidong Chen

doi:10.1109/JAS.2021.1003907

Volume 9 Issue 2

Feb. 2022

IEEE/CAA Journal of Automatica Sinica

JCR Impact Factor: 19.2, Top 1 (SCI Q1)

CiteScore: 28.2, Top 1% (Q1)
Google Scholar h5-index: 95, Top 5

H. J. Hu, H. S. Wang, Z. Liu, and W. D. Chen, “Domain-invariant similarity activation map contrastive learning for retrieval-based long-term visual localization,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 2, pp. 313–328, Feb. 2022. doi: 10.1109/JAS.2021.1003907

Hanjiang Hu^1, Hesheng Wang^2, Zhe Liu^3, Weidong Chen^1

1. Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China
2. Department of Automation, Institute of Medical Robotics, Key Laboratory of System Control and Information Processing of Ministry of Education, Key Laboratory of Marine Intelligent Equipment and System of Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
3. Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, United Kingdom

Author Bio:
Hanjiang Hu received the B.Eng. degree in mechanical engineering from Shanghai Jiao Tong University in 2018. He is currently working toward the M.S. degree in control science and engineering at Shanghai Jiao Tong University. His current research interests include visual perception and localization, autonomous driving, computer vision, and machine learning.

Hesheng Wang (Senior Member, IEEE) received the B.Eng. degree in electrical engineering from the Harbin Institute of Technology in 2002, and the M.Phil. and Ph.D. degrees in automation and computer-aided engineering from The Chinese University of Hong Kong, China, in 2004 and 2007, respectively. He was a Post-Doctoral Fellow and Research Assistant with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, China, from 2007 to 2009. He is currently a Professor with the Department of Automation, Shanghai Jiao Tong University. His current research interests include visual servoing, service robots, adaptive robot control, and autonomous driving. Dr. Wang is an Associate Editor of Assembly Automation and the International Journal of Humanoid Robotics, and a Technical Editor of the IEEE/ASME Transactions on Mechatronics. He served as an Associate Editor of the IEEE Transactions on Robotics from 2015 to 2019. He was the General Chair of IEEE RCAR 2016, and the Program Chair of IEEE ROBIO 2014 and IEEE/ASME AIM 2019.

Zhe Liu received the B.S. degree in automation from Tianjin University in 2010, and the Ph.D. degree in control technology and control engineering from Shanghai Jiao Tong University in 2016. From 2017 to 2020, he was a Post-Doctoral Fellow with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, China. He is currently a Research Associate with the Department of Computer Science and Technology, University of Cambridge, UK. His research interests include autonomous mobile robots, multi-robot cooperation, and autonomous driving systems.

Weidong Chen (Member, IEEE) received the B.S. and M.S. degrees in control engineering in 1990 and 1993, respectively, and the Ph.D. degree in mechatronics in 1996, all from the Harbin Institute of Technology. Since 1996, he has been at Shanghai Jiao Tong University, where he is currently a Professor in the Department of Automation and Deputy Dean of the Institute of Medical Robotics. He is the Founder of the Autonomous Robot Laboratory. From 2013 to 2019, he served as Chair of the Department of Automation. His current research interests include perception and control of robotic systems, multi-robot systems, and medical robotics.
Corresponding author: Hesheng Wang, e-mail: wanghesheng@sjtu.edu.cn
Received Date: 2020-09-23
Revised Date: 2020-10-25
Accepted Date: 2020-12-06

Available Online: 2020-12-18

Abstract

Visual localization is a crucial component of mobile robot and autonomous driving applications. Image retrieval is an efficient and effective technique for image-based localization. However, drastic variability in environmental conditions, e.g., illumination changes, severely degrades retrieval-based visual localization and makes it a challenging problem. In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation. Then, a novel gradient-weighted similarity activation mapping loss (Grad-SAM) is incorporated for finer localization with high accuracy. We also propose a new adaptive triplet loss to boost the contrastive learning of the embedding in a self-supervised manner. The final coarse-to-fine image retrieval pipeline is implemented as the sequential combination of models with and without the Grad-SAM loss. Extensive experiments have been conducted on the CMU-Seasons dataset to validate the effectiveness of the proposed approach. The strong generalization ability of our approach is verified on the RobotCar dataset using models pre-trained on the urban parts of the CMU-Seasons dataset. Our approach performs on par with, or even outperforms, the state-of-the-art image-based localization baselines at medium or high precision, especially in challenging environments with illumination variance, vegetation, and night-time images. Moreover, real-site experiments have been conducted to validate the efficiency and effectiveness of the coarse-to-fine strategy for localization.
Keywords: deep representation learning, place recognition, visual localization
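
The coarse-to-fine retrieval idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity, the `top_k` parameter, and the toy descriptors are all assumptions made for illustration. It also assumes that one model supplies cheap "coarse" descriptors for shortlisting and the other supplies "fine" descriptors for re-ranking; which trained model plays which role is not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def coarse_to_fine_retrieve(query_coarse, query_fine, db_coarse, db_fine, top_k=3):
    """Hypothetical two-stage retrieval.

    Stage 1 ranks all database images by coarse-descriptor similarity and
    keeps the top_k candidates. Stage 2 re-ranks only those candidates by
    fine-descriptor similarity and returns the best index.
    """
    coarse_scores = [cosine_similarity(query_coarse, d) for d in db_coarse]
    candidates = sorted(
        range(len(db_coarse)), key=lambda i: coarse_scores[i], reverse=True
    )[:top_k]
    best = max(candidates, key=lambda i: cosine_similarity(query_fine, db_fine[i]))
    return best, candidates


# Toy descriptors (illustrative only, not real network outputs).
db_coarse = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
db_fine = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

best, candidates = coarse_to_fine_retrieve(
    [1.0, 0.0], [1.0, 0.0], db_coarse, db_fine, top_k=2
)
print(best, candidates)
```

In this toy setup the coarse stage shortlists two database images and the fine stage picks between them, so the expensive comparison runs on `top_k` images rather than on the whole database.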

Highlights

A domain-invariant feature learning framework is proposed with a feature consistency loss
A new gradient-weighted similarity activation map is proposed for high-accuracy retrieval
A novel self-supervised contrastive learning scheme is proposed with an adaptive triplet loss
Our results are on par with state-of-the-art baselines while using an efficient two-stage pipeline
