IEEE/CAA Journal of Automatica Sinica
Citation: X. Chen, B. Xu, M. Hu, Y. Bian, Y. Li, and X. Xu, “Safe efficient policy optimization algorithm for unsignalized intersection navigation,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 9, pp. 2011–2026, Sept. 2024. doi: 10.1109/JAS.2024.124287