An Overview and Experimental Study of Learning-Based Optimization Algorithms for the Vehicle Routing Problem

Bingjie Li; Guohua Wu; Yongming He; Mingfeng Fan; Witold Pedrycz

doi:10.1109/JAS.2022.105677

Volume 9 Issue 7

Jul. 2022

IEEE/CAA Journal of Automatica Sinica

JCR Impact Factor: 15.3, Top 1 (SCI Q1)

CiteScore: 23.5, Top 2% (Q1)
Google Scholar h5-index: 77， TOP 5

Turn off MathJax

Article Contents

Article Navigation > IEEE/CAA Journal of Automatica Sinica > 2022 > 9(7): 1115-1138

B. J. Li, G. H. Wu, Y. M. He, M. F. Fan, and W. Pedrycz, “An overview and experimental study of learning-based optimization algorithms for the vehicle routing problem,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 7, pp. 1115–1138, Jul. 2022. doi: 10.1109/JAS.2022.105677

Citation:

B. J. Li, G. H. Wu, Y. M. He, M. F. Fan, and W. Pedrycz, “An overview and experimental study of learning-based optimization algorithms for the vehicle routing problem,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 7, pp. 1115–1138, Jul. 2022. doi: 10.1109/JAS.2022.105677

Citation:

PDF( 2393 KB)

An Overview and Experimental Study of Learning-Based Optimization Algorithms for the Vehicle Routing Problem

doi: 10.1109/JAS.2022.105677

Bingjie Li^1
,,
Guohua Wu^{1
,
,},
Yongming He^2
,,
Mingfeng Fan^1
,,
Witold Pedrycz^{3, 4
,}

1.
School of Traffic and Transportation Engineering, Central South University, Changsha 410075, China
2.
College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
3.
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada, the Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia
4.
Systems Research Institute, Polish Academy of Sciences, Warsaw 01447, Poland

More Information

Author Bio:
Bingjie Li received the B.E. degree from Central South University in 2019. Currently, she is working toward the M.E. degree at the School of Traffic and Transportation Engineering, Central South University. Her research interests include machine learning and path planning

Guohua Wu received the B.S. degree in information systems and Ph.D degree in operations research from National University of Defense Technology in 2008 and 2014, respectively. During 2012 and 2014, he was a visiting Ph.D student at University of Alberta, Canada. He is currently a Professor at the School of Traffic and Transportation Engineering, Central South University. His current research interests include planning and scheduling, computational intelligence and machine learning. He has authored more than 100 referred papers including those published in IEEE TCYB, IEEE TSMCA and IEEE TEVC. He serves as an Associate Editor of Information Sciences, and an Associate Editor of Swarm and Evolutionary Computation Journal, an Editorial Board Member of International Journal of Bio-Inspired Computation, and Guest Editors of several journals

Yongming He received the B.S. degree in logistics engineering from Chang’an University in 2014. He is currently pursuing the Ph.D. degree in management science and engineering with the College of Systems Engineering, National University of Defense Technology. He was a visiting Ph.D. student with the University of Alberta, Canada, from November 2018 to November 2019. His research interests include operations research, artificial intelligence, intelligent decision, task scheduling, and planning

Mingfeng Fan received the B.S. degree in transport equipment and control engineering from Central South University in 2019. She is currently pursuing the Ph.D. degree in traffic and transportation engineering with Central South University. Her research interests include machine learning and UAV path planning

Witold Pedrycz (Fellow, IEEE) received the M.Sc. degree in computer science, the Ph.D. degree in computer engineering, and the D.Sci. degree in systems science from the Silesian University of Technology, Gliwice, Poland, in 1977, 1980, and 1984, respectively. He is a Professor and the Canada Research Chair (CRC-Computational Intelligence) with the Department of Electrical and Computer Engineering, University of Alberta, Canada, and also with the Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Saudi Arabia. He is also with the Systems Research Institute, Polish Academy of Sciences, Poland. In 2012, he was elected a Fellow of the Royal Society of Canada. His current research interests include computational intelligence, knowledge discovery and data mining, pattern recognition, knowledge-based neural networks, and software engineering. He has published numerous papers in the above areas. He is an Editor-in-Chief of Information Sciences. He currently serves as an Associate Editor of IEEE Transactions on Fuzzy Systems and IEEE Transactions on Systems, Man and Cybernetics: System and is a Member of a number of editorial boards of other international journals
Corresponding author: Guohua Wu, e-mail: guohuawu@csu.edu.cn
Received Date: 2021-11-12
Revised Date: 2022-01-01
Accepted Date: 2022-02-16

Available Online: 2022-05-20

Abstract

Abstract

The vehicle routing problem (VRP) is a typical discrete combinatorial optimization problem, and many models and algorithms have been proposed to solve the VRP and its variants. Although existing approaches have contributed significantly to the development of this field, these approaches either are limited in problem size or need manual intervention in choosing parameters. To solve these difficulties, many studies have considered learning-based optimization (LBO) algorithms to solve the VRP. This paper reviews recent advances in this field and divides relevant approaches into end-to-end approaches and step-by-step approaches. We performed a statistical analysis of the reviewed articles from various aspects and designed three experiments to evaluate the performance of four representative LBO algorithms. Finally, we conclude the applicable types of problems for different LBO algorithms and suggest directions in which researchers can improve LBO algorithms.
- End-to-end approaches,
- learning-based optimization (LBO) algorithms,
- reinforcement learning,
- step-by-step approaches,
- vehicle routing problem (VRP)

FullText(HTML)

References(159)

References

[1]	T. Pinto, C. Alves, and J. V. de Carvalho, “Models and advanced optimization algorithms for the integrated management of logistics operations,” in Proc. IO2017, Valença, Portugal, pp. 313–324.
[2]	R. S. Khan, J. B. Yang, and J. Handl, “Interactive multi-objective vehicle routing via GA-based dynamic programming,” in Proc. Int. Conf. Transportation Information and Safety, Wuhan, China, 2015, pp. 318–322.
[3]	K. Dorling, J. Heinrichs, G. G. Messier, and S. Magierowski, “Vehicle routing problems for drone delivery,” IEEE Trans. Syst. Man Cybern. Syst., vol. 47, no. 1, pp. 70–85, Jan. 2017. doi: 10.1109/TSMC.2016.2582745
[4]	P. Toth and D. Vigo, “An overview of vehicle routing problems,” in The Vehicle Routing Problem, P. Toth and D. Vigo, Eds. Philadelphia, USA: Society for Industrial and Applied Mathematics, 2002, pp. 1–26.
[5]	J. F. Cordeau, G. Laporte, M. W. P. Savelsbergh, and D. Vigo, “Vehicle routing,” Handb. Oper. Res. Manage. Sci., vol. 14, pp. 367–428, 2007.
[6]	D. Pisinger and S. Ropke, “A general heuristic for vehicle routing problems,” Comput. Oper. Res., vol. 34, no. 8, pp. 2403–2435, Aug. 2007. doi: 10.1016/j.cor.2005.09.012
[7]	J. Euchi and H. Chabchoub, “A hybrid tabu search to solve the heterogeneous fixed fleet vehicle routing problem,” Logist. Res., vol. 2, no. 1, pp. 3–11, Jun. 2010. doi: 10.1007/s12159-010-0028-3
[8]	M. B. Guzairov, N. I. Yusupova, O. N. Smetanina, and E. Y. Rassadnikova, “Models and algorithms for the vehicle routing problem with time windows and other conditions,” in Proc. 13th Int. Scientific-Technical Conf. Actual Problems of Electronics Instrument Engineering, Novosibirsk, Russia, 2016, pp. 412–416.
[9]	H. Afaq and S. Saini, “A novel approach to solve graph based travelling salesman problem using particle swarm optimization technique,” in Proc. IEEE Int. Conf. Computational Intelligence and Computing Research, Coimbatore, India, 2012, pp. 1–4.
[10]	M. A. Mohammed, M. S. Ahmad, and S. A. Mostafa, “Using genetic algorithm in implementing capacitated vehicle routing problem,” in Proc. Int. Conf. Computer & Information Science, Kuala Lumpur, Malaysia, 2012, pp. 257–262.
[11]	J. H. Wilck IV and T. M. Cavalier, “A construction heuristic for the split delivery vehicle routing problem,” Am. J. Oper. Res., vol. 2, no. 2, pp. 153–162, Jun. 2012.
[12]	U. Ritzinger, J. Puchinger, and R. F. Hartl, “A survey on dynamic and stochastic vehicle routing problems,” Int. J. Prod. Res., vol. 54, no. 1, pp. 215–231, Jan. 2016. doi: 10.1080/00207543.2015.1043403
[13]	T. Jastrzab and A. Buchcik, “Practical applications of smart delivery systems: Mine evacuation as an example of rich vehicle routing problem,” Smart Delivery Systems, J. Nalepa, Ed. Amsterdam, The Netherlands: Elsevier, 2020, pp. 249–268.
[14]	J. K. Lenstra and A. H. G. R. Kan, “Complexity of vehicle routing and scheduling problems,” Networks, vol. 11, no. 2, pp. 221–227, 1981. doi: 10.1002/net.3230110211
[15]	B. Golden, S. Raghavan, and E. Wasil, The Vehicle Routing Problem: Latest Advances and New Challenges. New York, USA: Springer, 2008.
[16]	J. Mańdziuk, “New shades of the vehicle routing problem: Emerging problem formulations and computational intelligence solution methods,” IEEE Trans. Emerg. Top. Comput. Intell., vol. 3, no. 3, pp. 230–244, Jun. 2019. doi: 10.1109/TETCI.2018.2886585
[17]	A. O. Adewumi and O. J. Adeleke, “A survey of recent advances in vehicle routing problems,” Int. J. Syst. Assur. Eng. Manage., vol. 9, no. 1, pp. 155–172, Feb. 2018. doi: 10.1007/s13198-016-0493-4
[18]	A. Dixit, A. Mishra, and A. Shukla, “Vehicle routing problem with time windows using meta-heuristic algorithms: A survey,” in Harmony Search and Nature Inspired Optimization Algorithms, N. Yadav, A. Yadav, J. C. Bansal, K. Deep, and J. H. Kim, Eds. Singapore: Springer, 2019, pp. 539–546.
[19]	W. J. Cao and W. S. Yang, “A survey of vehicle routing problem,” MATEC Web Conf., vol. 100, p. 01006. Mar. 2017.
[20]	R. Goel and R. Maini, “Vehicle routing problem and its solution methodologies: A survey,” Int. J. Logist. Syst. Manage., vol. 28, no. 4, pp. 419–435, Nov. 2017.
[21]	K. A. Smith, “Neural networks for combinatorial optimization: A review of more than a decade of research,” INFORMS J. Comput., vol. 11, no. 1, pp. 15–34, Feb. 1999. doi: 10.1287/ijoc.11.1.15
[22]	O. I. Abiodun, A. Jantan, A. E. Omolara, K. V. Dada, N. A. Mohamed, and H. Arshad, “State-of-the-art in artificial neural network applications: A survey,” Heliyon, vol. 4, no. 11, p. E00938, Nov. 2018.
[23]	X. D. Zhang, “Machine learning,” in A Matrix Algebra Approach to Artificial Intelligence, X. D. Zhang, Ed. Singapore: Springer, 2020, pp. 223–440.
[24]	I. El Naqa and M. J. Murphy, “What is machine learning?” in Machine Learning in Radiation Oncology, I. El Naqa, R. J. Li, M. J. Murphy, Eds. Cham, Germany: Springer, 2015, pp. 3–11.
[25]	A. François, Q. Cappart, and L. M. Rousseau, “How to evaluate machine learning approaches for combinatorial optimization: Application to the travelling salesman problem,” arXiv preprint arXiv: 1909.13121, 2019.
[26]	R. B. Bai, X. N. Chen, Z. L. Chen, T. X. Cui, S. H. Gong, W. T. He, X. P. Jiang, H. Jin, J. H. Jin, G. Kendall, J. W. Li, Z. Lu, J. F. Ren, P. Weng, N. Xue, and H. Y. Zhang, “Analytics and machine learning in vehicle routing research,” arXiv preprint arXiv: 2102.10012, 2021
[27]	G. B. Dantzig and J. H. Ramser, “The truck dispatching problem,” Manage Sci, vol. 6, no. 1, pp. 80–91, Oct. 1959.
[28]	G. Clarke and J. W. Wright, “Scheduling of vehicles from a central depot to a number of delivery points,” Oper. Res., vol. 12, no. 4, pp. 568–581, Aug. 1964. doi: 10.1287/opre.12.4.568
[29]	G. Laporte, “What you should know about the vehicle routing problem,” Naval Res. Logist., vol. 54, no. 8, pp. 811–819, Dec. 2007. doi: 10.1002/nav.20261
[30]	G. Laporte, “Fifty years of vehicle routing,” Transport Sci, vol. 43, no. 4, pp. 408–416, Oct. 2009. doi: 10.1287/trsc.1090.0301
[31]	R. Fukasawa, H. Longo, J. Lysgaard, M. P. de Aragão, M. Reis, E. Uchoa, and R. F. Werneck, “Robust branch-and-cut-and-price for the capacitated vehicle routing problem,” Math. Program., vol. 106, no. 3, pp. 491–511, May 2006. doi: 10.1007/s10107-005-0644-x
[32]	M. Junger, G. Reinelt, and G. Rinaldi, “The traveling salesman problem,” Handb. Oper. Res. Manage. Sci., vol. 7, pp. 225–330, 1995.
[33]	P. Toth and D. Vigo, “Models, relaxations and exact approaches for the capacitated vehicle routing problem,” Discrete Appl. Math., vol. 123, no. 1–3, pp. 487–512, Nov. 2002. doi: 10.1016/S0166-218X(01)00351-1
[34]	M. Fisher, “Vehicle routing,” Handb. Oper. Res. Manage. Sci., vol. 8, pp. 1–33, 1995.
[35]	J. Caceres-Cruz, P. Arias, D. Guimarans, D. Riera, and A. A. Juan, “Rich vehicle routing problem: Survey,” ACM Comput. Surv., vol. 47, no. 2, Article No. 32, Jan. 2015.
[36]	G. Laporte and Y. Nobert, “Exact algorithms for the vehicle routing problem,” North-Holland Math. Stud., vol. 132, pp. 147–184, 1987.
[37]	G. Laporte, H. Mercure, and Y. Nobert, “An exact algorithm for the asymmetrical capacitated vehicle routing problem,” Networks, vol. 16, no. 1, pp. 33–46, Mar. 1986. doi: 10.1002/net.3230160104
[38]	S. Eilon, C. D. T. Watson-Gandy, N. Christofides, and R. de Neufville, “Distribution management-mathematical modelling and practical analysis,” IEEE Trans. Syst. Man Cybern., vol. SMC-4, no. 6, pp. 589–589, Nov. 1974.
[39]	M. R. Rao and S. Zionts, “Allocation of transportation units to alternative trips—a column generation scheme with out-of-kilter subproblems,” Oper. Res., vol. 16, no. 1, pp. 52–63, Feb. 1968. doi: 10.1287/opre.16.1.52
[40]	N. Christofides and S. Eilon, “An algorithm for the vehicle-dispatching problem,” J. Oper. Res. Soc., vol. 20, no. 3, pp. 309–318, Sep. 1969. doi: 10.1057/jors.1969.75
[41]	N. Christofides, A. Mingozzi, and P. Toth, “Exact algorithms for the vehicle routing problem, based on spanning tree and shortest path relaxations,” Math. Program., vol. 20, no. 1, pp. 255–282, Dec. 1981. doi: 10.1007/BF01589353
[42]	B. L. Golden, E. A. Wasil, J. P. Kelly, and I. M. Chao, “The impact of metaheuristics on solving the vehicle routing problem: Algorithms, problem sets, and computational results,” in Fleet Management and Logistics, T. G. Crainic and G. Laporte, Eds. New York, USA: Springer, 1998, pp. 33–56.
[43]	S. N. Kumar and R. Panneerselvam, “A survey on the vehicle routing problem and its variants,” Intell. Inf. Manage., vol. 4, no. 3, pp. 66–74, May 2012.
[44]	J. F. Cordeau and G. Laporte, “Tabu search heuristics for the vehicle routing problem,” Metaheuristic Optimization via Memory and Evolution, R. Sharda, S. Voß, C. Rego, and B. Alidaee, Eds. Boston, USA: Springer, 2005, pp. 145–163.
[45]	G. A. P. Kindervater and M. W. P. Savelsbergh, “Vehicle routing: Handling edge exchanges,” Local Search in Combinatorial Optimization, E. Aarts and J. K. Lenstra, Eds. New York, USA: Wiley, 1997.
[46]	S. Desale, A. Rasool, S. Andhale, and P. Rane, “Heuristic and meta-heuristic algorithms and their relevance to the real world: A survey,” Int. J. Comput. Eng. Res. Trends, vol. 351, no. 5, pp. 2349–7084, 2015.
[47]	G. Laporte and F. Semet, “Classical heuristics for the capacitated vrp,” in The Vehicle Routing Problem, P. Toth and D. Vigo, Eds. Philadelphia, USA: Society for Industrial and Applied Mathematics, 2002, pp. 109–128.
[48]	T. O. Ayodele, “Types of machine learning algorithms,” in New Advances in Machine Learning, Y. G. Zhang, Ed. IntechOpen, 2010, pp. 19–48.
[49]	C. M. Bishop, Pattern Recognition and Machine Learning. New York, USA: Springer, 2006.
[50]	L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: A survey,” J. Artif. Intell. Res., vol. 4, no. 1, pp. 237–285, Jun. 1996.
[51]	R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. 2nd ed. Cambridge, USA: MIT Press, 2018.
[52]	J. C. Yan, S. Yang, and E. Hancock, “Learning for graph matching and related combinatorial optimization problems,” in Proc. 29th Int. Joint Conf. Artificial Intelligence, Yokohama, Japan, 2020, pp. 4988–4996.
[53]	B. Mahesh, “Machine learning algorithms—a review,” Int. J. Sci. Res., vol. 9, no. 1, pp. 381–386, Jan. 2020.
[54]	M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling, “The arcade learning environment: An evaluation platform for general agents,” J. Artif. Intell. Res., vol. 47, pp. 253–279, Jun. 2013. doi: 10.1613/jair.3912
[55]	D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, Jan. 2016. doi: 10.1038/nature16961
[56]	S. Levine, C. Finn, T. Darrell, and P. Abbeel, “End-to-end training of deep visuomotor policies,” J. Mach. Learn. Res., vol. 17, no. 1, pp. 1334–1373, Jan. 2016.
[57]	K. Xu, J. L. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio, “Show, attend and tell: Neural image caption generation with visual attention,” in Proc. 32nd Int. Conf. Int. Conf. Machine Learning, Lille, France, 2015, pp. 2048–2057.
[58]	H. D. Song, I. Triguero, and E. Özcan, “A review on the self and dual interactions between machine learning and optimisation,” Prog. Artif. Intell., vol. 8, no. 2, pp. 143–165, Jun. 2019. doi: 10.1007/s13748-019-00185-z
[59]	M. Karimi-Mamaghan, M. Mohammadi, P. Meyer, A. M. Karimi-Mamaghan, and E. G. Talbi, “Machine learning at the service of meta-heuristics for solving combinatorial optimization problems: A state-of-the-art,” Eur. J. Oper. Res., vol. 296, no. 2, pp. 393–422, Jan. 2022. doi: 10.1016/j.ejor.2021.04.032
[60]	L. M. Gambardella and M. Dorigo, “Ant-Q: A Reinforcement Learning approach to the traveling salesman problem,” in Machine Learning Proceedings 1995, A. Prieditis and S. Russell, Eds. Amsterdam, The Netherlands: Elsevier, 1995, pp. 252–260.
[61]	F. C. de Lima Junior, J. D. de Melo, and A. D. D. Neto, “Using Q-learning algorithm for initialization of the GRASP metaheuristic and genetic algorithm,” in Proc. Int. Joint Conf. Neural Networks, Orlando, USA, 2007, pp. 1243–1248.
[62]	M. M. Alipour, S. N. Razavi, M. R. F. Derakhshi, and M. A. Balafar, “A hybrid algorithm using a genetic algorithm and multiagent reinforcement learning heuristic to solve the traveling salesman problem,” Neural Comput. Appl., vol. 30, no. 9, pp. 2935–2951, Nov. 2018. doi: 10.1007/s00521-017-2880-4
[63]	F. Liu and G. Z. Zeng, “Study of genetic algorithm with reinforcement learning to solve the TSP,” Exp. Syst. Appl., vol. 36, no. 3, pp. 6995–7001, Apr. 2009. doi: 10.1016/j.eswa.2008.08.026
[64]	T. Phiboonbanakit, T. Horanont, T. Supnithi, and V. N. Huynh, “Knowledge-based learning for solving vehicle routing problem,” in Proc. ACM Int. Joint Conf. and 2018 Int. Symp. Pervasive and Ubiquitous Computing and Wearable Computers, Singapore, 2018, pp. 1103–1111.
[65]	J. Y. Ding, C. Zhang, L. Shen, S. Y. Li, B. Wang, Y. H. Xu, and L. Song, “Accelerating primal solution findings for mixed integer programs based on solution prediction,” in Proc. AAAI Conf. Artif. Intell., vol. 34, no. 2, pp. 1452–1459, Apr. 2020.
[66]	Y. Sun, A. Ernst, X. D. Li, and J. Weiner, “Generalization of machine learning for problem reduction: A case study on travelling salesman problems,” OR Spectrum, vol. 43, no. 3, pp. 607–633, Sep. 2021. doi: 10.1007/s00291-020-00604-x
[67]	P. L. N. U. Cooray and T. D. Rupasinghe, “Machine learning-based parameter tuned genetic algorithm for energy minimizing vehicle routing problem,” J. Ind. Eng., vol. 2017, Jan. 2017.
[68]	F. Al-Duoli, G. Rabadi, M. Seck, and H. A. Handley, “Hybridizing meta-raps with machine learning algorithms,” in Proc. IEEE Technology and Engineering Management Conf., Evanston, USA, 2018, pp. 1–6.
[69]	P. Shaw, “A new local search algorithm providing high quality solutions to vehicle routing problems,” University of Strathclyde, Glasgow, Scotland, 1997.
[70]	G. Schrimpf, J. Schneider, H. Stamm-Wilbrandt, and G. Dueck, “Record breaking optimization results using the ruin and recreate principle,” J. Comput. Phys., vol. 159, no. 2, pp. 139–171, Apr. 2000. doi: 10.1006/jcph.1999.6413
[71]	D. Pisinger and S. Ropke, “Large neighborhood search,” in Handbook of Metaheuristics, M. Gendreau and J. Y. Potvin, Eds. New York, USA: Springer, 2010, pp. 399–419.
[72]	A. Hottung and K. Tierney, “Neural large neighborhood search for the capacitated vehicle routing problem,” Proc. 24th European Conf. Artificial Intelligence, Santiago de Compostela, Spain, 2020.
[73]	M. X. Chen, L. Gao, Q. C. Chen, and Z. X. Liu, “Dynamic partial removal: A neural network heuristic for large neighborhood search,” arXiv preprint arXiv: 2005.09330, 2020.
[74]	L. Gao, M. X. Chen, Q. C. Chen, G. Z. Luo, N. Y. Zhu, and Z. X. Liu, “Learn to design the heuristics for vehicle routing problem,” arXiv preprint arXiv: 2002.08539, 2020.
[75]	M. M. Solomon, “Algorithms for the vehicle routing and scheduling problems with time window constraints,” Oper. Res., vol. 35, no. 2, pp. 254–265, Apr. 1987. doi: 10.1287/opre.35.2.254
[76]	R. A. C. Bianchi, C. H. C. Ribeiro, and A. H. R. Costa, “On the relation between ant colony optimization and heuristically accelerated reinforcement learning,” in Proc. 1st Int. Workshop on Hybrid Control of Autonomous System, Pasadena, USA, 2009, pp. 49–55.
[77]	F. D. Yang, T. C. Jin, T. Y. Liu, X. M. Sun, and J. L. Zhang, “Boosting dynamic programming with neural networks for solving NP-hard problems,” in Proc. 10th Asian Conf. Machine Learning, Beijing, China, 2018, pp. 726–739.
[78]	W. Joe and H. C. Lau, “Deep reinforcement learning approach to solve dynamic vehicle routing problem with stochastic customers,” in Proc. Int. Conf. Autom. Plann. Sched., vol. 30, no. 1, pp. 394–402, Jun. 2020.
[79]	A. Delarue, R. Anderson, and C. Tjandraatmadja, “Reinforcement learning with combinatorial actions: An application to vehicle routing,” arXiv preprint arXiv: 2010.12001, 2020.
[80]	Y. Yao, Z. Peng, and B. Xiao, “Parallel hyper-heuristic algorithm for multi-objective route planning in a smart city,” IEEE Trans. Veh. Technol., vol. 67, no. 11, pp. 10307–10318, Nov. 2018. doi: 10.1109/TVT.2018.2868942
[81]	X. Y. Chen and Y. D. Tian, “Learning to perform local rewriting for combinatorial optimization,” in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 6281–6292.
[82]	H. Lu, X. W. Zhang, and S. Yang, “A learning-based iterative method for solving vehicle routing problems,” in Proc. 8th Int. Conf Learning Representations, Addis Ababa, Ethiopia, 2020.
[83]	P. R. de O. da Costa, J. Rhuggenaath, Y. Q. Zhang, and A. Akcay, “Learning 2-opt heuristics for the traveling salesman problem via deep reinforcement learning,” in Proc. 12th Asian Conf. Machine Learning, Bangkok, Thailand, 2020, pp. 465–480.
[84]	Y. X. Wu, W. Song, Z. G. Cao, J. Zhang, and A. Lim, “Learning improvement heuristics for solving routing problems,” IEEE Trans. Neural Netw. Learn. Syst., 2021, DOI: 10.1109/TNNLS.2021.3068828
[85]	O. Vinyals, M. Fortunato, and N. Jaitly, “Pointer networks,” in Proc. 28th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2015, pp. 692–2700.
[86]	M. V. Vlastelica, A. Paulus, V. Musil, G. Martius, and M. Rolínek, “Differentiation of blackbox combinatorial solvers,” in Proc. 8th Int. Conf. Learning Representations, Addis Ababa, Ethiopia, 2020.
[87]	Y. N. Ma, J. W. Li, Z. G. Cao, W. Song, L. Zhang, Z. H. Chen, and J. Tang, “Learning to iteratively solve routing problems with dual-aspect collaborative transformer,” in Proc. 35th Conf. Neural Information Processing Systems, Sydney, Australia, 2021.
[88]	W. Qin, Z. L. Zhuang, Z. Z. Huang, and H. Z. Huang, “A novel reinforcement learning-based hyper-heuristic for heterogeneous vehicle routing problem,” Comput. Ind. Eng, vol. 156, p. 107252, Jun. 2021.
[89]	J. Mlejnek and J. Kubalik, “Evolutionary hyperheuristic for capacitated vehicle routing problem,” in Proc. 15th Annu. Conf. Companion on Genetic and Evolutionary Computation, Amsterdam, The Netherlands, 2013, pp. 219–220.
[90]	M. Okulewicz and J. Mańdziuk, “A particle swarm optimization hyper-heuristic for the dynamic vehicle routing problem,” arXiv preprint arXiv: 2006.08809, 2020.
[91]	D. Meignan, A. Koukam, and J. C. Créput, “Coalition-based metaheuristic: A self-adaptive metaheuristic using reinforcement learning and mimetism,” J. Heuristics, vol. 16, no. 6, pp. 859–879, Dec. 2010. doi: 10.1007/s10732-009-9121-7
[92]	S. Asta and E. Özcan, “An apprenticeship learning hyper-heuristic for vehicle routing in HyFlex,” in Proc. IEEE Symp. Evolving and Autonomous Learning Systems, Orlando, USA, 2014, pp. 65–72.
[93]	R. Tyasnurita, E. Özcan, and R. John, “Learning heuristic selection using a time delay neural network for open vehicle routing,” in Proc. IEEE Congr. Evolutionary Computation, Donostia, Spain, 2017, pp. 1474–1481.
[94]	P. Kerschke, L. Kotthoff, J. Bossek, H. H. Hoos, and H. Trautmann, “Leveraging TSP solver complementarity through machine learning,” Evol. Comput., vol. 26, no. 4, pp. 597–620, Dec. 2018. doi: 10.1162/evco_a_00215
[95]	K. F. Zhao, S. C. Liu, J. X. Yu, and Y. Rong, “Towards feature-free TSP solver selection: A deep learning approach,” in Proc. Int. Joint Conf. Neural Networks, Shenzhen, China, 2021, pp. 1–8.
[96]	A. E. Gutierrez-Rodríguez, S. E. Conant-Pablos, J. C. Ortiz-Bayliss, and H. Terashima-Marín, “Selecting meta-heuristics for solving vehicle routing problems with time windows via meta-learning,” Expert Syst. Appl., vol. 118, pp. 470–481, Mar. 2019. doi: 10.1016/j.eswa.2018.10.036
[97]	S. Martin, D. Ouelhadj, P. Beullens, E. Ozcan, A. A. Juan, and E. K. Burke, “A multi-agent based cooperative approach to scheduling and routing,” Eur. J. Oper. Res., vol. 254, no. 1, pp. 169–178, Oct. 2016. doi: 10.1016/j.ejor.2016.02.045
[98]	Z. C. Lipton, J. Berkowitz, and C. Elkan, “A critical review of recurrent neural networks for sequence learning,” arXiv preprint arXiv: 1506.00019, 2015.
[99]	J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proc. Natl. Acad. Sci. USA, vol. 79, no. 8, pp. 2554–2558, Apr. 1982. doi: 10.1073/pnas.79.8.2554
[100]	S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997. doi: 10.1162/neco.1997.9.8.1735
[101]	M. Schuster and K. K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Trans. Signal Process., vol. 45, no. 11, pp. 2673–2681, Nov. 1997. doi: 10.1109/78.650093
[102]	I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 3104–3112.
[103]	A. Karpathy and L. Fei-Fei, “Deep visual-semantic alignments for generating image descriptions,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Boston, USA: IEEE, 2015, pp. 3128–3137.
[104]	D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv: 1409.0473, 2014.
[105]	I. Bello, H. Pham, Q. V. Le, M. Norouzi, and S. Bengio, “Neural combinatorial optimization with reinforcement learning,” arXiv preprint arXiv: 1611.09940, 2016.
[106]	D. Levy and L. Wolf, “Learning to align the source code to the compiled object code,” in Proc. 34th Int. Conf. Machine Learning, Sydney, Australia, 2017, pp. 2043–2051.
[107]	M. Deudon, P. Cournut, A. Lacoste, Y. Adulyasak, and L. M. Rousseau, “Learning heuristics for the TSP by policy gradient,” in Proc. 15th Int. Conf. Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Delft, The Netherlands, 2018, pp. 170–181.
[108]	K. W. Li, T. Zhang, and R. Wang, “Deep reinforcement learning for multiobjective optimization,” IEEE Trans. Cybern., vol. 51, no. 6, pp. 3103–3114, Jun. 2021. doi: 10.1109/TCYB.2020.2977661
[109]	Y. Kaempfer and L. Wolf, “Learning the multiple traveling salesmen problem with permutation invariant pooling networks,” arXiv preprint arXiv: 1803.09621, 2018.
[110]	A. V. Le, P. Veerajagadheswar, P. Thiha Kyaw, M. R. Elara, and N. H. K. Nhan, “Coverage path planning using reinforcement learning-based TSP for htetran—a polyabolo-inspired self-reconfigurable tiling robot,” Sensors, vol. 21, no. 8, p. 2577, Apr. 2021.
[111]	Q. Ma, S. W. Ge, D. Y. He, D. Thaker, and I. Drori, “Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning,” arXiv preprint arXiv: 1911.04936, 2019.
[112]	H. J. Dai, E. B. Khalil, Y. Y. Zhang, B. Dilkina, and L. Song, “Learning combinatorial optimization algorithms over graphs,” in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, USA, 2017, pp. 6351–6361.
[113]	H. J. Dai, B. Dai, and L. Song, “Discriminative embeddings of latent variable models for structured data,” in Proc. 33rd Int. Conf. Int. Conf. Machine Learning, New York, USA, 2016, pp. 2702–2711.
[114]	A. L. C. Ottoni, E. G. Nepomuceno, and M. S. de Oliveira, “A response surface model approach to parameter estimation of reinforcement learning for the travelling salesman problem,” J. Control Autom. Electr. Syst., vol. 29, no. 3, pp. 350–359, Jun. 2018. doi: 10.1007/s40313-018-0374-y
[115]	Z. H. Fu, K. B. Qiu, and H. Y. Zha, “Generalize a small pre-trained model to arbitrarily large tsp instances,” in Proc. AAAI Conf. Artif. Intell., vo. 5, no. 8, pp. 7474–7482, May 2020.
[116]	R. K. Zhang, A. Prokhorchuk, and J. Dauwels, “Deep reinforcement learning for traveling salesman problem with time windows and rejections,” in Proc. Int. Joint Conf. Neural Networks, Glasgow, UK, 2020, pp. 1–8.
[117]	W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” arXiv preprint arXiv: 1803.08475, 2018.
[118]	Z. H. Xing and S. K. Tu, “A graph neural network assisted monte carlo tree search approach to traveling salesman problem,” IEEE Access, vol. 8, pp. 108418–108428, Jan. 2020. doi: 10.1109/ACCESS.2020.3000236
[119]	E. Groshev, A. Tamar, M. Goldstein, S. Srivastava, and P. Abbeel, “Learning generalized reactive policies using deep neural networks,” in Proc. 28th AAAI Spring Symposia, Stanford University, Palo Alto, USA, 2018.
[120]	C. K. Joshi, T. Laurent, and X. Bresson, “An efficient graph convolutional network technique for the travelling salesman problem,” arXiv preprint arXiv: 1906.01227, 2019.
[121]	M. Prates, P. H. C. Avelar, H. Lemos, L. C. Lamb, and M. Y. Vardi, “Learning to solve NP-complete problems: A graph neural network for decision TSP,” in Proc. 29th AAAI Conf. Artif. Intell., vol. 33, no. 1, pp. 4731–4738, Jul. 2019.
[122]	N. Sultana, J. Chan, T. Sarwar, and A. K. Qin, “Learning to optimise general TSP instances,” arXiv preprint arXiv: 2010.12214, 2020
[123]	C. K. Joshi, T. Laurent, and X. Bresson, “On learning paradigms for the travelling salesman problem,” arXiv preprint arXiv: 1910.07210, 2019.
[124]	M. Nazari, A. Oroojlooy, M. Takáč, and L. V. Snyder, “Reinforcement learning for solving the vehicle routing problem,” in Proc. 32nd Int. Conf. Neural Information Processing Systems, Montréal Canada, 2018, pp. 9861–9871.
[125]	A. A. Ibrahim, N. Lo, R. O. Abdulaziz, and J. Ishaya, “Capacitated vehicle routing problem,” Int. J. Res. Granth., vol. 7, no. 3, pp. 310–327, Mar. 2019. doi: 10.29121/granthaalayah.v7.i3.2019.976
[126]	B. Peng, J. H. Wang, and Z. Z. Zhang, “A deep reinforcement learning algorithm using dynamic attention model for vehicle routing problems,” in Proc. 11th Int. Symp. Artificial Intelligence Algorithms and Applications. Guangzhou, China, 2019, pp. 636–650.
[127]	L. Duan, Y. Zhan, H. Y. Hu, Y. Gong, J. W. Wei, X. D. Zhang, and Y. H. Xu, “Efficiently solving the practical vehicle routing problem: A novel joint learning approach,” in Proc. 26th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, 2020, pp. 3054–3063.
[128]	J. M. Vera and A. G. Abad, “Deep reinforcement learning for routing a heterogeneous fleet of vehicles,” in Proc. IEEE Latin American Conf. Computational Intelligence, Guayaquil, Ecuador, 2019, pp. 1–6.
[129]	J. K. Falkner and L. Schmidt-Thieme, “Learning to solve vehicle routing problems with time windows through joint attention,” arXiv preprint arXiv: 2006.09100, 2020.
[130]	L. Xin, W. Song, Z. G. Cao, and J. Zhang, “Step-wise deep learning models for solving routing problems,” IEEE Trans. Ind. Inf., vol. 17, no. 7, pp. 4861–4871, Jul. 2020.
[131]	L. Xin, W. Song, Z. G. Cao, and, J. Zhang, “Multi-decoder attention model with embedding glimpse for solving vehicle routing problems,” in Proc. 35th AAAI Conf. Artificial Intelligence, 2021, pp. 12042–12049.
[132]	K. Zhang, F. He, Z. C. Zhang, X. Lin, and M. Li, “Multi-vehicle routing problems with soft time windows: A multi-agent reinforcement learning approach,” Trans. Res. Part C Emerg. Technol., vol. 121, p. 102861, Dec. 2020.
[133]	J. X. Zhao, M. J. Mao, X. Zhao, and J. H. Zou, “A hybrid of deep reinforcement learning and local search for the vehicle routing problems,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 11, pp. 7208–7218, Nov. 2021. doi: 10.1109/TITS.2020.3003163
[134]	N. Sultana, J. Chan, A. K. Qin, and T. Sarwar, “Learning vehicle routing problems using policy optimisation,” arXiv preprint arXiv: 2012.13269, 2020.
[135]	I. Drori, A. Kharkar, W. R. Sickinger, B. Kates, Q. Ma, S. W. Ge, E. Dolev, B. Dietrich, D. P. Williamson, and M. Udell, “Learning to solve combinatorial optimization problems on real-world graphs in linear time,” in Proc. 19th IEEE Int. Conf. Machine Learning and Applications, Miami, USA, 2020, pp. 19–24.
[136]	G. Bono, J. S. Dibangoye, O. Simonin, L. Matignon, and F. Pereyron, “Solving multi-agent routing problems using deep attention mechanisms,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 12, pp. 7804–7813, Dec. 2021. doi: 10.1109/TITS.2020.3009289
[137]	B. Lin, B. Ghaddar, and J. Nathwani, “Deep reinforcement learning for the electric vehicle routing problem with time windows,” IEEE Trans. Intell. Transp. Syst., 2021, DOI: 10.1109/TITS.2021.3105232
[138]	J. W. Li, L. Xin, Z. G. Cao, A. Lim, W. Song, and J. Zhang, “Heterogeneous attentions for solving pickup and delivery problem via deep reinforcement learning,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 3, pp. 2306–2315, Mar. 2022. doi: 10.1109/TITS.2021.3056120
[139]	J. W. Li, Y. N. Ma, R. Z. Gao, Z. G. Cao, A. Lim, W. Song, and J. Zhang, “Deep reinforcement learning for solving the heterogeneous capacitated vehicle routing problem,” IEEE Trans. Cybern., 2021, DOI: 10.1109/TCYB.2021.3111082
[140]	J. L. Li, D. W. Fu, Q. Yuan, H. H. Zhang, K. H. Chen, S. Yang, and F. C. Yang, “A traffic prediction enabled double rewarded value iteration network for route planning,” IEEE Trans. Veh. Technol., vol. 68, no. 5, pp. 4170–4181, May 2019. doi: 10.1109/TVT.2019.2893173
[141]	J. J. Q. Yu, W. Yu, and J. T. Gu, “Online vehicle routing with neural combinatorial optimization and deep reinforcement learning,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 10, pp. 3806–3817, Oct. 2019. doi: 10.1109/TITS.2019.2909109
[142]	B. Balaji, J. Bell-Masterson, E. Bilgin, A. Damianou, P. M. Garcia, A. Jain, R. F. Luo, A. Maggiar, B. Narayanaswamy, and C. Ye, “Orl: Reinforcement learning benchmarks for online stochastic optimization problems,” arXiv preprint arXiv: 1911.10641, 2019.
[143]	O. M. Andrychowicz, B. Baker, M. Chociej, R. Józefowicz, B. McGrew, J. Pachocki, A. Petron, M. Plappert, G. Powell, A. Ray, J. Schneider, S. Sidor, J. Tobin, P. Welinder, L. L. Weng, and W. Zaremba, “Learning dexterous in-hand manipulation,” Int. J. Rob. Res., vol. 39, no. 1, pp. 3–20, Jan. 2020. doi: 10.1177/0278364919887447
[144]	D. Mukhutdinov, A. Filchenkov, A. Shalyto, and V. Vyatkin, “Multi-agent deep learning for simultaneous optimization for time and energy in distributed routing system,” Future Gener. Comput. Syst., vol. 94, pp. 587–600, May 2019. doi: 10.1016/j.future.2018.12.037
[145]	R. RamachandranPillai and M. Arock, “An adaptive spiking neural p system for solving vehicle routing problems,” Arabian J. Sci. Eng., vol. 45, no. 4, pp. 2513–2529, Apr. 2020. doi: 10.1007/s13369-019-04153-6
[146]	Y. X. Sheng, H. W. Ma, and W. Xia, “A pointer neural network for the vehicle routing problem with task priority and limited resources,” Inf Technol Control, vol. 49, no. 2, pp. 237–248, Sep. 2020. doi: 10.5755/j01.itc.49.2.24613
[147]	T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” in Proc. Conf. Empirical Methods in Natural Language Processing, Lisbon, Portugal, 2015, pp. 1412–1421.
[148]	Z. G. Cao, H. L. Guo, W. Song, K. Z. Gao, Z. H. Chen, L. Zhang, and X. X. Zhang, “Using reinforcement learning to minimize the probability of delay occurrence in transportation,” IEEE Trans. Veh. Technol., vol. 69, no. 3, pp. 2424–2436, Mar. 2020. doi: 10.1109/TVT.2020.2964784
[149]	C. Chen, J. G. Jiang, N. Lv, and S. Y. Li, “An intelligent path planning scheme of autonomous vehicles platoon using deep reinforcement learning on network edge,” IEEE Access, vol. 8, pp. 99059–99069, May 2020. doi: 10.1109/ACCESS.2020.2998015
[150]	A. K. Kalakanti, S. Verma, T. Paul, and T. Yoshida, “RL SolVeR pro: Reinforcement learning for solving vehicle routing problem,” in Proc. 1st Int. Conf. Artificial Intelligence and Data Sciences, Ipoh, Malaysia, 2019, pp. 94–99.
[151]	J. Holler, R. Vuorio, Z. W. Qin, X. C. Tang, Y. Jiao, T. C. Jin, S. Singh, C. X. Wang, and J. P. Ye, “Deep reinforcement learning for multi-driver vehicle dispatching and repositioning problem,” in Proc. IEEE Int. Conf. Data Mining, Beijing, China, 2019, pp. 1090–1095.
[152]	N. Vesselinova, R. Steinert, D. F. Perez-Ramirez, and M. Boman, “Learning combinatorial optimization on graphs: A survey with applications to networking,” IEEE Access, vol. 8, pp. 120388–120416, Jun. 2020. doi: 10.1109/ACCESS.2020.3004964
[153]	A. Subramanian, E. Uchoa, and L. S. Ochi, “A hybrid algorithm for a class of vehicle routing problems,” Comput. Oper. Res., vol. 40, no. 10, pp. 2519–2531, Oct. 2013. doi: 10.1016/j.cor.2013.01.013
[154]	X. Yao and Y. Liu, “Machine learning,” in Search Methodologies, E. K. Burke and G. Kendall, Eds. Boston, USA: Springer, 2014, pp. 477–517.
[155]	N. Altman and M. Krzywinski, “The curse(s) of dimensionality,” Nat. Methods, vol. 15, no. 6, pp. 399–400, Jun. 2018. doi: 10.1038/s41592-018-0019-x
[156]	M. Nickel, K. Murphy, V. Tresp, and E. Gabrilovich, “A review of relational machine learning for knowledge graphs,” Proc. IEEE, vol. 104, no. 1, pp. 11–33, Jan. 2016. doi: 10.1109/JPROC.2015.2483592
[157]	K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev, and A. Walsh, “Machine learning for molecular and materials science,” Nature, vol. 559, no. 7715, pp. 547–555, Jul. 2018. doi: 10.1038/s41586-018-0337-2
[158]	Q. Cappart, D. Chetélat, E. B. Khalil, A. Lodi, C. Morris, and P. Veličkovič, “Combinatorial optimization and reasoning with graph neural networks,” in Proc. Thirtieth Int. Joint Conf. Artificial Intelligence, Montreal, Canada, 2021, pp. 4348–4355.
[159]	K. L. Wagstaff, “Machine learning that matters,” in Proc. 29th Int. Conf. Machine Learning, Edinburgh, UK, 2012.

Supplements(0)

Cited By

Proportional views

Proportional views

通讯作者: 陈斌, bchen63@163.com

1.
沈阳化工大学材料科学与工程学院沈阳 110142

Figures(11) / Tables(7)

Get Citation

PDF

XML

Article Metrics

Article views (2426) PDF downloads(338)

Highlights

The learning-based optimization (LBO) algorithms have been applied to vehicle routing problems (VRP), and many remarkable research breakthroughs have been achieved. The paper briefly introduces the applications of LBO algorithms to the VRP to aid beginners in understanding the development of this field
Although the LBO algorithms can overcome the deficiencies of the tedious parameter tuning of exact and heuristic algorithms and rapidly solve online instances with the advantage of offline training, the LBO algorithms still have some technological bottlenecks to be settled. The paper discusses the advantages and disadvantages of different learning-based optimization algorithms based on extensive experiments on different datasets
Using the LBO algorithms to solve the VRP is still under research and there are several challenges in the LBO algorithms that need to be settled in the future. The paper suggests several potential research directions of applying the LBO algorithms in the VRP from these limitations

An Overview and Experimental Study of Learning-Based Optimization Algorithms for the Vehicle Routing Problem

doi: 10.1109/JAS.2022.105677

Abstract

References

Proportional views

Catalog

通讯作者: 陈斌, bchen63@163.com

Article Metrics

Highlights

Export File

Citation

Format

Content