Citation: L. Ren, Z. Fang, W. Liu, C. Mu, and C. Sun, “Deep reinforcement learning for UAV indoor navigation through task decomposition,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 12, pp. 2627–2629, Dec. 2025. doi: 10.1109/JAS.2025.125642
[1] S. Awasthi, M. Fernandez-Cortizas, C. Reining, P. Arias-Perez, M. A. Luna, D. Perez-Saura, M. Roidl, N. Gramse, P. Klokowski, and P. Campoy, “Micro UAV swarm for industrial applications in indoor environment: A systematic literature review,” Logist. Res., vol. 16, no. 1, pp. 1–43, 2023. doi: 10.21009/logistik.v16i01.34732
[2] D. Mourtzis, J. Angelopoulos, and N. Panopoulos, “Unmanned aerial vehicle (UAV) path planning and control assisted by augmented reality (AR): The case of indoor drones,” Int. J. Prod. Res., vol. 62, no. 9, pp. 3361–3382, 2024. doi: 10.1080/00207543.2023.2232470
[3] L. Xu, J. Liu, X. Chang, X. Liu, and C. Sun, “Hazard-aware weighted advantage combination for UAV target tracking and obstacle avoidance,” IEEE/CAA J. Autom. Sinica, vol. 12, pp. 1260–1271, 2024.
[4] V. R. Miranda, A. A. Neto, G. M. Freitas, and L. A. Mozelli, “Generalization in deep reinforcement learning for robotic navigation by reward shaping,” IEEE Trans. Ind. Electron., vol. 71, no. 6, pp. 6013–6020, 2023.
[5] B. Han, Z. Ren, Z. Wu, Y. Zhou, and J. Peng, “Off-policy reinforcement learning with delayed rewards,” in Proc. Int. Conf. Machine Learning, 2022, pp. 8280–8303.
[6] R. Devidze, P. Kamalaruban, and A. Singla, “Exploration-guided reward shaping for reinforcement learning under sparse rewards,” Adv. Neural Inf. Process. Syst., vol. 35, pp. 5829–5842, 2022.
[7] C. Wang, J. Wang, J. Wang, and X. Zhang, “Deep-reinforcement-learning-based autonomous UAV navigation with sparse rewards,” IEEE Internet Things J., vol. 7, no. 7, pp. 6180–6190, 2020. doi: 10.1109/JIOT.2020.2973193
[8] B. Eysenbach, R. R. Salakhutdinov, and S. Levine, “Search on the replay buffer: Bridging planning and reinforcement learning,” Adv. Neural Inf. Process. Syst., vol. 32, pp. 15246–15257, 2019.
[9] S. Lee, J. Kim, I. Jang, and H. J. Kim, “DHRL: A graph-based approach for long-horizon and sparse hierarchical reinforcement learning,” Adv. Neural Inf. Process. Syst., vol. 35, pp. 13668–13678, 2022.
[10] A. Sivaramakrishnan, S. Tangirala, E. Granados, N. R. Carver, and K. E. Bekris, “Roadmaps with gaps over controllers: Achieving efficiency in planning under dynamics,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 2024, pp. 11064–11069.
[11] J. Panerati, H. Zheng, S. Zhou, J. Xu, A. Prorok, and A. P. Schoellig, “Learning to fly – a Gym environment with PyBullet physics for reinforcement learning of multi-agent quadcopter control,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 2021, pp. 7512–7519.
[12] W. Liu, W. Cai, K. Jiang, G. Cheng, Y. Wang, J. Wang, J. Cao, L. Xu, C. Mu, and C. Sun, “XuanCe: A comprehensive and unified deep reinforcement learning library,” arXiv preprint arXiv:2312.16248, 2023.
[13] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” in Proc. Int. Conf. Learning Representations, 2016. [Online]. Available: http://arxiv.org/abs/1509.02971