A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation.
Volume 11 Issue 9
Sep. 2024

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: Q. Liu, X. Cui, Z. Liu, and H. Wang, “Cognitive navigation for intelligent mobile robots: A learning-based approach with topological memory configuration,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 9, pp. 1933–1943, Sept. 2024. doi: 10.1109/JAS.2024.124332

Cognitive Navigation for Intelligent Mobile Robots: A Learning-Based Approach With Topological Memory Configuration

doi: 10.1109/JAS.2024.124332
Funds: This work was supported in part by the National Natural Science Foundation of China (62225309, 62073222, U21A20480, 62361166632).
Abstract: Autonomous navigation for intelligent mobile robots has gained significant attention, with a focus on enabling robots to generate reliable policies based on maintained spatial memory. In this paper, we propose a learning-based visual navigation pipeline that uses topological maps as memory configurations. We introduce a unique online topology construction approach that fuses odometry pose estimation and perceptual similarity estimation; this tackles the problems of topological node redundancy and incorrect edge connections, which stem from the distribution gap between the spatial and perceptual domains. Furthermore, we propose a differentiable graph extraction structure, the topology multi-factor transformer (TMFT), which utilizes graph neural networks to integrate global memory and incorporates a multi-factor attention mechanism to underscore elements closely related to relevant target cues for policy generation. Results from photorealistic simulations on image-goal navigation tasks highlight the superior navigation performance of our proposed pipeline compared to existing memory structures. Comprehensive validation through behavior visualization, interpretability tests, and real-world deployment further confirms the adaptability and efficacy of our method.
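To make the fused construction rule concrete, the minimal Python sketch below suppresses a new topological node only when some existing node is both spatially close (by odometry) and perceptually similar (by feature embedding). This is an illustration in the spirit of the abstract, not the paper's exact formulation: the Node class, the cosine-similarity measure, and the thresholds d_min and s_max are all assumptions.

```python
import numpy as np

class Node:
    """Hypothetical topological-map node: an odometry pose plus a
    visual feature embedding (both field names are assumptions)."""
    def __init__(self, pose, embedding):
        self.pose = np.asarray(pose, dtype=float)            # (x, y) from odometry
        self.embedding = np.asarray(embedding, dtype=float)  # perceptual feature

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def should_add_node(graph, pose, embedding, d_min=0.5, s_max=0.9):
    """Reject a new node only when some existing node is BOTH within
    d_min meters (spatial check) AND above s_max cosine similarity
    (perceptual check). Fusing the two cues suppresses redundant nodes
    without merging perceptually aliased but spatially distinct places."""
    for node in graph:
        spatially_close = np.linalg.norm(node.pose - np.asarray(pose)) < d_min
        perceptually_close = cosine_similarity(node.embedding, embedding) > s_max
        if spatially_close and perceptually_close:
            return False  # observation is redundant with an existing node
    return True

# Usage: grow the map online as observations arrive.
graph = []
pose, emb = np.array([0.0, 0.0]), np.random.randn(128)
if should_add_node(graph, pose, emb):
    graph.append(Node(pose, emb))
```

Requiring both checks to fail before inserting a node is one plausible way to bridge the spatial/perceptual distribution gap the abstract describes: odometry alone cannot distinguish visually different viewpoints at nearby poses, while perceptual similarity alone can wrongly merge distant but look-alike places.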




    Highlights

    • Introduce a learning-based visual navigation pipeline that uses topological memory
    • Construct precise topological memory by integrating perceptual and spatial relations
    • Develop TMFT, a differentiable graph extraction structure that reads out topological memory for navigation (a minimal attention sketch follows)
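As an illustration of the readout idea behind TMFT, the sketch below applies goal-conditioned scaled dot-product attention over node features such as those a graph neural network would produce. It is a minimal single-head, single-factor sketch under assumed shapes; the projection matrices Wq, Wk, Wv and the random placeholder inputs are hypothetical simplifications of the paper's multi-factor attention mechanism.

```python
import numpy as np

def attention_readout(goal, nodes, Wq, Wk, Wv):
    """Single-head, goal-conditioned attention over topological memory.
    goal: (d,) target-cue embedding; nodes: (n, d) node features, e.g.
    the output of GNN message passing. Returns a memory summary
    weighted toward the nodes most relevant to the goal."""
    q = goal @ Wq                         # project the goal cue to a query
    K = nodes @ Wk                        # keys, one per memory node
    V = nodes @ Wv                        # values, one per memory node
    scores = K @ q / np.sqrt(K.shape[1])  # scaled dot-product relevance
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over the n memory nodes
    return weights @ V                    # relevance-weighted readout

# Usage with random placeholders for the goal and node features.
rng = np.random.default_rng(0)
d, n = 32, 10
goal, nodes = rng.normal(size=d), rng.normal(size=(n, d))
Wq, Wk, Wv = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]
policy_input = attention_readout(goal, nodes, Wq, Wk, Wv)  # feeds the policy head
```

Conditioning the attention weights on the target cue is what lets a policy emphasize the memory elements tied to the goal, rather than averaging indiscriminately over the whole graph.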
