Regional Multi-Agent Cooperative Reinforcement Learning for City-Level Traffic Grid Signal Control

Yisha Li; Ya Zhang; Xinde Li; Changyin Sun

doi:10.1109/JAS.2024.124365

Volume 11 Issue 9

Sep. 2024

IEEE/CAA Journal of Automatica Sinica



Y. Li, Y. Zhang, X. Li, and C. Sun, “Regional multi-agent cooperative reinforcement learning for city-level traffic grid signal control,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 9, pp. 1987–1998, Sept. 2024. doi: 10.1109/JAS.2024.124365


Funds: This work was supported by the National Science and Technology Major Project (2021ZD0112702), the National Natural Science Foundation (NNSF) of China (62373100, 62233003), and the Natural Science Foundation of Jiangsu Province of China (BK20202006).


Author Bio:
Yisha Li received the B.S. degree in automation from Jiangsu University in 2021. She is a master's student in electronic information at Southeast University. Her research interests include multiagent systems, reinforcement learning, and intelligent transportation systems.

Ya Zhang (Senior Member, IEEE) received the B.S. degree in applied mathematics from China University of Mining and Technology in 2004, and the Ph.D. degree in control engineering from Southeast University in 2010. Since 2010, she has been with Southeast University, where she is currently a Professor with the School of Automation. Her research interests include multiagent systems, reinforcement learning, and network security.

Xinde Li (Senior Member, IEEE) received the Ph.D. degree in control theory and control engineering from the Department of Control Science and Engineering, Huazhong University of Science and Technology (HUST) in 2007. Afterward, he joined the School of Automation, Southeast University, where he is currently a Professor and Ph.D. Supervisor. His research interests include information fusion, object recognition, computer vision, and intelligent robots.

Changyin Sun (Senior Member, IEEE) received the B.S. degree in applied mathematics from the College of Mathematics, Sichuan University in 1996, and the M.S. and Ph.D. degrees in electrical engineering from Southeast University in 2001 and 2004, respectively. He is currently a Professor with the School of Automation, Southeast University. His current research interests include intelligent control, flight control, pattern recognition, and optimal theory. He is an Associate Editor of the IEEE Transactions on Neural Networks and Learning Systems, the IEEE Neural Processing Letters, and the IEEE/CAA Journal of Automatica Sinica.
Corresponding author: Ya Zhang, e-mail: yazhang@seu.edu.cn; Xinde Li, e-mail: xindeli@seu.edu.cn
Received Date: 2024-01-22
Revised Date: 2024-02-23
Accepted Date: 2024-02-29

Available Online: 2024-06-07

Abstract


This article studies effective traffic signal control for multiple intersections in a city-level traffic system. A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve traffic efficiency. First, a regional multi-agent Q-learning framework is proposed, which can equivalently decompose the global Q value of the traffic system into the local values of several regions. Based on this framework and the idea of human-machine cooperation, a dynamic zoning method is designed to divide the traffic network into several strongly coupled regions according to real-time traffic flow densities. To achieve better cooperation inside each region, a lightweight spatio-temporal fusion feature extraction network is designed. Experiments in synthetic, real-world, and city-level scenarios show that the proposed RegionSTLight converges more quickly, is more stable, and obtains better asymptotic performance than state-of-the-art models.
Keywords:
- Human-machine cooperation
- mixed domain attention mechanism
- multi-agent reinforcement learning
- spatio-temporal feature
- traffic signal control
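
The regional decomposition mentioned in the abstract can be illustrated with a small sketch. It assumes the simplest possible form, in which the global Q value is the plain sum of per-region Q values over disjoint regions; the paper's actual decomposition is not given on this page and may differ. The function names `regional_greedy` and `global_greedy` and the random Q tables are hypothetical, introduced only for illustration. Under that additive assumption, choosing the best action separately in each region gives the same joint action as searching the global Q value directly.

```python
import itertools

import numpy as np


def regional_greedy(q_tables):
    """Pick the best action in each region independently."""
    return [int(np.argmax(q)) for q in q_tables]


def global_greedy(q_tables):
    """Brute-force search over the global Q value.

    Assumption: the global Q value is the sum of the regional Q values.
    """
    action_ranges = [range(len(q)) for q in q_tables]
    best = max(
        itertools.product(*action_ranges),
        key=lambda joint: sum(q[a] for q, a in zip(q_tables, joint)),
    )
    return list(best)


# Hypothetical Q values: 3 regions, 4 candidate regional actions each.
rng = np.random.default_rng(0)
q_tables = [rng.normal(size=4) for _ in range(3)]

# Both searches agree, but the regional one scans 3 * 4 entries
# while the global one scans 4 ** 3 joint actions.
print(regional_greedy(q_tables) == global_greedy(q_tables))
```

The practical point of such a decomposition is cost: the regional search grows linearly with the number of regions, whereas the joint search grows exponentially.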



Highlights

- For the city-level traffic signal control problem, a regional multi-agent Q-learning framework is developed to reduce the complex overall control problem to several regional control problems.
- Based on the idea of human-machine cooperation, a dynamic zoning approach is designed to divide the entire traffic network into several strongly coupled regions.
- A lightweight spatio-temporal fusion feature extraction network is designed to achieve better cooperation inside each region.
- Numerical experiments are conducted in a synthetic scenario, a real-world scenario, and a city-level scenario to illustrate the effectiveness of the proposed method.
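
As a rough illustration of density-based zoning, the sketch below groups intersections that are linked by heavily loaded roads, using a union-find structure. This is not the paper's zoning method, which is not described on this page; the function `zone_by_density`, the `threshold` parameter, and the toy road data are all assumptions made for the example. Calling the function again with fresh densities yields a new partition, which is the sense in which the zoning here is dynamic.

```python
def zone_by_density(num_nodes, roads, threshold):
    """Group intersections joined by roads with density >= threshold.

    roads: iterable of (u, v, density) tuples (hypothetical format).
    Returns a sorted list of regions, each a list of intersection ids.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, density in roads:
        if density >= threshold:
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_v] = root_u  # merge the two regions

    regions = {}
    for node in range(num_nodes):
        regions.setdefault(find(node), []).append(node)
    return sorted(regions.values())


# Four intersections on a line: 0 - 1 - 2 - 3.
morning = [(0, 1, 0.8), (1, 2, 0.1), (2, 3, 0.7)]
evening = [(0, 1, 0.8), (1, 2, 0.9), (2, 3, 0.1)]

print(zone_by_density(4, morning, threshold=0.5))  # [[0, 1], [2, 3]]
print(zone_by_density(4, evening, threshold=0.5))  # [[0, 1, 2], [3]]
```

In this sketch the threshold is a fixed number chosen by hand; how the regions are actually chosen, and where the human-machine cooperation enters, is left to the full paper.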
