Citation:  J. Y. Ma, K. N. Zhang, and J. J. Jiang, “Loop closure detection via locality preserving matching with global consensus,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 2, pp. 1–16, Feb. 2023. doi: 10.1109/JAS.2022.105926 
[1] C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I. Reid, and J. J. Leonard, “Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age,” IEEE Trans. Robot., vol. 32, no. 6, pp. 1309–1332, 2016. doi: 10.1109/TRO.2016.2624754

[2] W. Huang, G. Zhang, and X. Han, “Dense mapping from an accurate tracking SLAM,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 6, pp. 1565–1574, 2020. doi: 10.1109/JAS.2020.1003357

[3] S. Lowry, N. Sünderhauf, P. Newman, J. J. Leonard, D. Cox, P. Corke, and M. J. Milford, “Visual place recognition: A survey,” IEEE Trans. Robot., vol. 32, no. 1, pp. 1–19, 2015.

[4] D. Gálvez-López and J. D. Tardós, “Bags of binary words for fast place recognition in image sequences,” IEEE Trans. Robot., vol. 28, no. 5, pp. 1188–1197, 2012. doi: 10.1109/TRO.2012.2197158

[5] L. Bampis, A. Amanatiadis, and A. Gasteratos, “Encoding the description of image sequences: A two-layered pipeline for loop closure detection,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2016, pp. 4530–4536.

[6] K. Zhang, X. Jiang, and J. Ma, “Appearance-based loop closure detection via locality-driven accurate motion field learning,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 3, pp. 2350–2365, 2022. doi: 10.1109/TITS.2021.3086822

[7] R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic, “NetVLAD: CNN architecture for weakly supervised place recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 5297–5307.

[8] Y. Xu, J. Huang, J. Wang, Y. Wang, H. Qin, and K. Nan, “ESA-VLAD: A lightweight network based on second-order attention and NetVLAD for loop closure detection,” IEEE Robot. Autom. Lett., vol. 6, no. 4, pp. 6545–6552, 2021. doi: 10.1109/LRA.2021.3094228

[9] S. An, G. Che, F. Zhou, X. Liu, X. Ma, and Y. Chen, “Fast and incremental loop closure detection using proximity graphs,” arXiv preprint arXiv:1911.10752, 2019.

[10] S. An, H. Zhu, D. Wei, K. A. Tsintotas, and A. Gasteratos, “Fast and incremental loop closure detection with deep features and proximity graphs,” J. Field Robot., vol. 39, no. 4, pp. 473–493, 2022. doi: 10.1002/rob.22060

[11] J. Sivic and A. Zisserman, “Video Google: A text retrieval approach to object matching in videos,” in Proc. IEEE Int. Conf. Comput. Vis., 2003, pp. 1–8.

[12] J. MacQueen et al., “Some methods for classification and analysis of multivariate observations,” in Proc. 5th Berkeley Symp. Math. Statist. Probab., 1967, pp. 281–297.

[13] F. Perronnin and C. Dance, “Fisher kernels on visual vocabularies for image categorization,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2007, pp. 1–8.

[14] H. Jégou, M. Douze, and C. Schmid, “Improving bag-of-features for large scale image search,” Int. J. Comput. Vis., vol. 87, no. 3, pp. 316–336, 2010. doi: 10.1007/s11263-009-0285-2

[15] R. Ji, L.-Y. Duan, J. Chen, H. Yao, J. Yuan, Y. Rui, and W. Gao, “Location discriminative vocabulary coding for mobile landmark search,” Int. J. Comput. Vis., vol. 96, no. 3, pp. 290–314, 2012. doi: 10.1007/s11263-011-0472-9

[16] G. Tolias, Y. Avrithis, and H. Jégou, “Image search with selective match kernels: Aggregation across single and multiple images,” Int. J. Comput. Vis., vol. 116, no. 3, pp. 247–261, 2016. doi: 10.1007/s11263-015-0810-4

[17] A. M. Andrew, “Multiple view geometry in computer vision,” Kybernetes, 2001.

[18] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, no. 6, pp. 381–395, 1981. doi: 10.1145/358669.358692

[19] J. Ma, J. Wu, J. Zhao, J. Jiang, H. Zhou, and Q. Z. Sheng, “Non-rigid point set registration with robust transformation learning under manifold regularization,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 12, pp. 3584–3597, 2019. doi: 10.1109/TNNLS.2018.2872528

[20] P. Weinzaepfel, T. Lucas, D. Larlus, and Y. Kalantidis, “Learning super-features for image retrieval,” in Proc. Int. Conf. Learn. Representations, 2022.

[21] J. Ma, X. Jiang, A. Fan, J. Jiang, and J. Yan, “Image matching from handcrafted to deep features: A survey,” Int. J. Comput. Vis., vol. 129, no. 1, pp. 23–79, 2021. doi: 10.1007/s11263-020-01359-2

[22] M. Cummins and P. Newman, “Appearance-only SLAM at large scale with FAB-MAP 2.0,” Int. J. Rob. Res., vol. 30, no. 9, pp. 1100–1123, 2011. doi: 10.1177/0278364910385483

[23] M. Cummins and P. Newman, “FAB-MAP: Probabilistic localization and mapping in the space of appearance,” Int. J. Rob. Res., vol. 27, no. 6, pp. 647–665, 2008. doi: 10.1177/0278364908090961

[24] R. Mur-Artal and J. D. Tardós, “Fast relocalisation and loop closing in keyframe-based SLAM,” in Proc. IEEE Int. Conf. Robot. Automat., 2014, pp. 846–853.

[25] E. S. Stumm, C. Mei, and S. Lacroix, “Building location models for visual place recognition,” Int. J. Rob. Res., vol. 35, no. 4, pp. 334–356, 2016. doi: 10.1177/0278364915570140

[26] A. Angeli, D. Filliat, S. Doncieux, and J.-A. Meyer, “A fast and incremental method for loop-closure detection using bags of visual words,” IEEE Trans. Robot., vol. 24, no. 5, pp. 1027–1037, 2008. doi: 10.1109/TRO.2008.2004514

[27] K. A. Tsintotas, L. Bampis, and A. Gasteratos, “Modest-vocabulary loop-closure detection with incremental bag of tracked words,” Robot. Auton. Syst., vol. 141, p. 103782, 2021.

[28] E. Garcia-Fidalgo and A. Ortiz, “iBoW-LCD: An appearance-based loop-closure detection approach using incremental bags of binary words,” IEEE Robot. Autom. Lett., vol. 3, no. 4, pp. 3051–3057, 2018. doi: 10.1109/LRA.2018.2849609

[29] K. A. Tsintotas, L. Bampis, and A. Gasteratos, “Probabilistic appearance-based place recognition through bag of tracked words,” IEEE Robot. Autom. Lett., vol. 4, no. 2, pp. 1737–1744, 2019. doi: 10.1109/LRA.2019.2897151

[30] K. A. Tsintotas, L. Bampis, and A. Gasteratos, “Assigning visual words to places for loop closure detection,” in Proc. IEEE Int. Conf. Robot. Automat., 2018, pp. 1–7.

[31] L. Bampis, A. Amanatiadis, and A. Gasteratos, “Fast loop-closure detection using visual-word-vectors from image sequences,” Int. J. Rob. Res., vol. 37, no. 1, pp. 62–82, 2018. doi: 10.1177/0278364917740639

[32] Y. Xia, J. Li, L. Qi, and H. Fan, “Loop closure detection for visual SLAM using PCANet features,” in Proc. Int. Joint Conf. Neural Netw., 2016, pp. 2274–2281.

[33] J. Ma, S. Wang, K. Zhang, Z. He, J. Huang, and X. Mei, “Fast and robust loop-closure detection via convolutional auto-encoder and motion consensus,” IEEE Trans. Ind. Informat., vol. 18, no. 6, pp. 3681–3691, 2022. doi: 10.1109/TII.2021.3120141

[34] J. Ma, L. Tang, F. Fan, J. Huang, X. Mei, and Y. Ma, “SwinFusion: Cross-domain long-range learning for general image fusion via Swin Transformer,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 7, pp. 1200–1217, 2022. doi: 10.1109/JAS.2022.105686

[35] O. Chum, J. Matas, and J. Kittler, “Locally optimized RANSAC,” in Proc. Joint Pattern Recognit. Symp., 2003, pp. 236–243.

[36] R. Raguram, O. Chum, M. Pollefeys, J. Matas, and J.-M. Frahm, “USAC: A universal framework for random sample consensus,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 8, pp. 2022–2038, 2012.

[37] D. Barath, J. Matas, and J. Noskova, “MAGSAC: Marginalizing sample consensus,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2019, pp. 10197–10205.

[38] D. Barath, J. Noskova, M. Ivashechkin, and J. Matas, “MAGSAC++, a fast, reliable and accurate robust estimator,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020, pp. 1304–1312.

[39] X. Li and Z. Hu, “Rejecting mismatches by correspondence function,” Int. J. Comput. Vis., vol. 89, no. 1, pp. 1–17, 2010. doi: 10.1007/s11263-010-0318-x

[40] J. Ma, J. Zhao, J. Tian, A. L. Yuille, and Z. Tu, “Robust point matching via vector field consensus,” IEEE Trans. Image Process., vol. 23, no. 4, pp. 1706–1721, 2014. doi: 10.1109/TIP.2014.2307478

[41] H. Chen, X. Zhang, S. Du, Z. Wu, and N. Zheng, “A correntropy-based affine iterative closest point algorithm for robust point set registration,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 4, pp. 981–991, 2019. doi: 10.1109/JAS.2019.1911579

[42] C. Leng, H. Zhang, G. Cai, Z. Chen, and A. Basu, “Total variation constrained non-negative matrix factorization for medical image registration,” IEEE/CAA J. Autom. Sinica, vol. 8, no. 5, pp. 1025–1037, 2021. doi: 10.1109/JAS.2021.1003979

[43] X. Jiang, Y. Xia, X.-P. Zhang, and J. Ma, “Robust image matching via local graph structure consensus,” Pattern Recognit., p. 108588, 2022.

[44] J. Ma, A. Fan, X. Jiang, and G. Xiao, “Feature matching via motion-consistency driven probabilistic graphical model,” Int. J. Comput. Vis., 2022.

[45] J. Ma, J. Zhao, J. Jiang, H. Zhou, and X. Guo, “Locality preserving matching,” Int. J. Comput. Vis., vol. 127, no. 5, pp. 512–531, 2019. doi: 10.1007/s11263-018-1117-z

[46] J. Bian, W.-Y. Lin, Y. Matsushita, S.-K. Yeung, T.-D. Nguyen, and M.-M. Cheng, “GMS: Grid-based motion statistics for fast, ultra-robust feature correspondence,” Int. J. Comput. Vis., vol. 128, no. 6, pp. 1580–1593, 2020. doi: 10.1007/s11263-019-01280-3

[47] X. Jiang, J. Ma, J. Jiang, and X. Guo, “Robust feature matching using spatial clustering with heavy outliers,” IEEE Trans. Image Process., vol. 29, pp. 736–746, 2020. doi: 10.1109/TIP.2019.2934572

[48] J. Ma, Z. Li, K. Zhang, Z. Shao, and G. Xiao, “Robust feature matching via neighborhood manifold representation consensus,” ISPRS J. Photogramm. Remote Sens., vol. 183, pp. 196–209, 2022. doi: 10.1016/j.isprsjprs.2021.11.004

[49] K. Zhang, J. Ma, and J. Jiang, “Loop closure detection with reweighting NetVLAD and local motion and structure consensus,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 6, pp. 1087–1090, 2022. doi: 10.1109/JAS.2022.105635

[50] K. M. Yi, E. Trulls, Y. Ono, V. Lepetit, M. Salzmann, and P. Fua, “Learning to find good correspondences,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 2666–2674.

[51] J. Ma, X. Jiang, J. Jiang, J. Zhao, and X. Guo, “LMR: Learning a two-class classifier for mismatch removal,” IEEE Trans. Image Process., vol. 28, no. 8, pp. 4045–4059, 2019. doi: 10.1109/TIP.2019.2906490

[52] P.-E. Sarlin, D. DeTone, T. Malisiewicz, and A. Rabinovich, “SuperGlue: Learning feature matching with graph neural networks,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020, pp. 4938–4947.

[53] G. Tolias, T. Jenicek, and O. Chum, “Learning and aggregating deep local descriptors for instance-level recognition,” in Proc. Eur. Conf. Comput. Vis., 2020, pp. 460–477.

[54] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Adv. Neural Inf. Process. Syst., 2017, pp. 1–11.

[55] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004. doi: 10.1023/B:VISI.0000029664.99615.94

[56] Y. Cheng, “Mean shift, mode seeking, and clustering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 8, pp. 790–799, 1995. doi: 10.1109/34.400568

[57] A. Fan, J. Ma, X. Jiang, and H. Ling, “Efficient deterministic search with robust loss functions for geometric model fitting,” IEEE Trans. Pattern Anal. Mach. Intell., 2021.

[58] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The KITTI dataset,” Int. J. Rob. Res., vol. 32, no. 11, pp. 1231–1237, 2013. doi: 10.1177/0278364913491297

[59] A. J. Glover, W. P. Maddern, M. J. Milford, and G. F. Wyeth, “FAB-MAP + RatSLAM: Appearance-based SLAM for multiple times of day,” in Proc. IEEE Int. Conf. Robot. Automat., 2010, pp. 3507–3512.

[60] J.-L. Blanco, F.-A. Moreno, and J. Gonzalez, “A collection of outdoor robotic datasets with centimeter-accuracy ground truth,” Auton. Robots, vol. 27, no. 4, p. 327, 2009.

[61] H. Bay, T. Tuytelaars, and L. Van Gool, “SURF: Speeded up robust features,” in Proc. Eur. Conf. Comput. Vis., 2006, pp. 404–417.

[62] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool, “A comparison of affine region detectors,” Int. J. Comput. Vis., vol. 65, no. 1–2, pp. 43–72, 2005. doi: 10.1007/s11263-005-3848-x

[63] F. Radenović, G. Tolias, and O. Chum, “Fine-tuning CNN image retrieval with no human annotation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 7, pp. 1655–1668, 2018.

[64] A. B. Laguna and K. Mikolajczyk, “Key.Net: Keypoint detection by handcrafted and learned CNN filters revisited,” IEEE Trans. Pattern Anal. Mach. Intell., 2022. doi: 10.1109/TPAMI.2022.3145820

[65] Y. Tian, A. Barroso Laguna, T. Ng, V. Balntas, and K. Mikolajczyk, “HyNet: Learning local descriptor with hybrid similarity measure and triplet loss,” in Adv. Neural Inf. Process. Syst., 2020, pp. 7401–7412.

[66] S. A. M. Kazmi and B. Mertsching, “Detecting the expectancy of a place using nearby context for appearance-based mapping,” IEEE Trans. Robot., vol. 35, no. 6, pp. 1352–1366, 2019. doi: 10.1109/TRO.2019.2926475
