A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation.
Volume 11 Issue 9
Sep. 2024

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: Q. Liu, X. Cui, Z. Liu, and H. Wang, “Cognitive navigation for intelligent mobile robots: A learning-based approach with topological memory configuration,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 9, pp. 1933–1943, Sept. 2024. doi: 10.1109/JAS.2024.124332

Cognitive Navigation for Intelligent Mobile Robots: A Learning-Based Approach With Topological Memory Configuration

doi: 10.1109/JAS.2024.124332
Funds: This work was supported in part by the National Natural Science Foundation of China (62225309, 62073222, U21A20480, 62361166632).
Abstract: Autonomous navigation for intelligent mobile robots has gained significant attention, with a focus on enabling robots to generate reliable policies based on maintained spatial memory. In this paper, we propose a learning-based visual navigation pipeline that uses topological maps as memory configurations. We introduce a unique online topology construction approach that fuses odometry pose estimation and perceptual similarity estimation; this tackles the problems of topological node redundancy and incorrect edge connections, which stem from the distribution gap between the spatial and perceptual domains. Furthermore, we propose a differentiable graph extraction structure, the topology multi-factor transformer (TMFT), which utilizes graph neural networks to integrate global memory and incorporates a multi-factor attention mechanism to underscore elements closely related to relevant target cues for policy generation. Results from photorealistic simulations on image-goal navigation tasks highlight the superior navigation performance of our proposed pipeline compared to existing memory structures. Comprehensive validation through behavior visualization, interpretability tests, and real-world deployment further confirms the adaptability and efficacy of our method.
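To make the fused construction rule concrete, the minimal Python sketch below suppresses a new topological node only when some existing node is both spatially close (by odometry) and perceptually similar (by feature embedding). This is an illustration in the spirit of the abstract, not the paper's exact formulation: the Node class, the cosine-similarity measure, and the thresholds d_min and s_max are all assumptions.

```python
import numpy as np

class Node:
    """Hypothetical topological-map node: an odometry pose plus a
    visual feature embedding (both field names are assumptions)."""
    def __init__(self, pose, embedding):
        self.pose = np.asarray(pose, dtype=float)            # (x, y) from odometry
        self.embedding = np.asarray(embedding, dtype=float)  # perceptual feature

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def should_add_node(graph, pose, embedding, d_min=0.5, s_max=0.9):
    """Reject a new node only when some existing node is BOTH within
    d_min meters (spatial check) AND above s_max cosine similarity
    (perceptual check). Fusing the two cues suppresses redundant nodes
    without merging perceptually aliased but spatially distinct places."""
    for node in graph:
        spatially_close = np.linalg.norm(node.pose - np.asarray(pose)) < d_min
        perceptually_close = cosine_similarity(node.embedding, embedding) > s_max
        if spatially_close and perceptually_close:
            return False  # observation is redundant with an existing node
    return True

# Usage: grow the map online as observations arrive.
graph = []
pose, emb = np.array([0.0, 0.0]), np.random.randn(128)
if should_add_node(graph, pose, emb):
    graph.append(Node(pose, emb))
```

Requiring both checks to fail before inserting a node is one plausible way to bridge the spatial/perceptual distribution gap the abstract describes: odometry alone cannot distinguish visually different viewpoints at nearby poses, while perceptual similarity alone can wrongly merge distant but look-alike places.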




    Highlights

    • Introduce a learning-based visual navigation pipeline that uses topological memory
    • Construct precise topological memory by integrating perceptual and spatial relations
    • Develop TMFT, a differentiable graph extraction structure that reads out topological memory for navigation (a minimal attention sketch follows)
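As an illustration of the readout idea behind TMFT, the sketch below applies goal-conditioned scaled dot-product attention over node features such as those a graph neural network would produce. It is a minimal single-head, single-factor sketch under assumed shapes; the projection matrices Wq, Wk, Wv and the random placeholder inputs are hypothetical simplifications of the paper's multi-factor attention mechanism.

```python
import numpy as np

def attention_readout(goal, nodes, Wq, Wk, Wv):
    """Single-head, goal-conditioned attention over topological memory.
    goal: (d,) target-cue embedding; nodes: (n, d) node features, e.g.
    the output of GNN message passing. Returns a memory summary
    weighted toward the nodes most relevant to the goal."""
    q = goal @ Wq                         # project the goal cue to a query
    K = nodes @ Wk                        # keys, one per memory node
    V = nodes @ Wv                        # values, one per memory node
    scores = K @ q / np.sqrt(K.shape[1])  # scaled dot-product relevance
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over the n memory nodes
    return weights @ V                    # relevance-weighted readout

# Usage with random placeholders for the goal and node features.
rng = np.random.default_rng(0)
d, n = 32, 10
goal, nodes = rng.normal(size=d), rng.normal(size=(n, d))
Wq, Wk, Wv = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]
policy_input = attention_readout(goal, nodes, Wq, Wk, Wv)  # feeds the policy head
```

Conditioning the attention weights on the target cue is what lets a policy emphasize the memory elements tied to the goal, rather than averaging indiscriminately over the whole graph.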
