A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 11, Issue 1, Jan. 2024

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
  • CiteScore: 17.6, Top 3% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: X. Tang, Y. Yang, T. Liu, X. Lin, K. Yang, and S. Li, “Path planning and tracking control for parking via soft actor-critic under non-ideal scenarios,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 1, pp. 181–195, Jan. 2024. doi: 10.1109/JAS.2023.123975

Path Planning and Tracking Control for Parking via Soft Actor-Critic Under Non-Ideal Scenarios

doi: 10.1109/JAS.2023.123975
Funds:  This work was supported by the National Natural Science Foundation of China (52222215, 52272420, 52072051)
Abstract

Parking in a small parking lot within limited space is a difficult task that often leaves a deviation between the final parking posture and the target posture. Such deviations can cause partial occupancy of adjacent parking lots, posing a safety threat to the vehicles parked there, yet previous studies have not addressed this issue. In this paper, we evaluate the impact of the parking deviation of existing vehicles next to the target parking lot (PDEVNTPL) on automatic ego vehicle (AEV) parking, in terms of the safety, comfort, accuracy, and efficiency of parking. A segmented parking training framework (SPTF) based on soft actor-critic (SAC) is proposed to improve parking performance. In the proposed method, the SAC algorithm incorporates policy entropy into the objective function, enabling the AEV to learn parking strategies from a more comprehensive exploration of the environment. Additionally, the SPTF decomposes the complex parking task into simpler segments to maintain the high performance of deep reinforcement learning (DRL). The experimental results reveal that the PDEVNTPL degrades AEV parking safety, accuracy, and comfort by more than 27%, 54%, and 26%, respectively. The SAC-based SPTF effectively mitigates this impact, raising the parking success rate considerably from 71% to 93% and reducing the heading angle deviation from 2.25 degrees to 0.43 degrees.
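
As a pointer for the entropy term mentioned above: SAC augments the expected return with the policy's entropy, following the standard maximum-entropy objective of Haarnoja et al. (2018). A sketch in the usual notation, where $\rho_\pi$ is the state-action distribution induced by policy $\pi$ and the temperature $\alpha$ trades off reward against exploration:

$$
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\!\left[ r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s)\big) = -\mathbb{E}_{a \sim \pi}\big[\log \pi(a \mid s)\big].
$$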



    Highlights

    • The parking deviations of existing vehicles have an impact on automatic parking
    • This impact is evaluated in terms of safety, comfort, deviation, and efficiency
    • Parking is decomposed into several simpler tasks to improve parking performance
    • Soft actor-critic is used to optimize the end-to-end automatic parking policy (see the sketch below)
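
To make the last highlight concrete, below is a minimal PyTorch sketch of the entropy-regularized actor update at the core of SAC. It is an illustration under assumptions, not the authors' implementation: the network sizes, the temperature value, and the names GaussianPolicy, QNetwork, and actor_loss are all hypothetical.

```python
# Minimal SAC actor-update sketch (hypothetical names and shapes, not the
# authors' code). state = parking observation, action = steering/throttle.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """State-action value critic Q(s, a)."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

class GaussianPolicy(nn.Module):
    """Squashed-Gaussian actor mapping a parking state to a bounded action."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def sample(self, state):
        h = self.body(state)
        dist = torch.distributions.Normal(
            self.mu(h), self.log_std(h).clamp(-20, 2).exp())
        u = dist.rsample()   # reparameterized sample for low-variance gradients
        a = torch.tanh(u)    # squash into the bounded action range [-1, 1]
        # tanh change-of-variables correction so log_prob matches the squashed action
        log_prob = (dist.log_prob(u) - torch.log(1 - a.pow(2) + 1e-6)).sum(-1)
        return a, log_prob

def actor_loss(policy, q1, q2, states, alpha=0.2):
    """SAC actor objective: E[alpha * log pi(a|s) - min(Q1, Q2)(s, a)]."""
    actions, log_prob = policy.sample(states)
    q = torch.min(q1(states, actions), q2(states, actions)).squeeze(-1)
    return (alpha * log_prob - q).mean()
```

Minimizing this loss alongside the usual twin-critic and replay-buffer updates gives a standard off-policy SAC loop; under the segmented framework described in the abstract, such updates would be applied per simplified parking subtask.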
