IEEE/CAA Journal of Automatica Sinica
Citation:  B. J. Li, G. H. Wu, Y. M. He, M. F. Fan, and W. Pedrycz, “An overview and experimental study of learningbased optimization algorithms for the vehicle routing problem,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 7, pp. 1115–1138, Jul. 2022. doi: 10.1109/JAS.2022.105677 
[1] 
T. Pinto, C. Alves, and J. V. de Carvalho, “Models and advanced optimization algorithms for the integrated management of logistics operations,” in Proc. IO2017, Valença, Portugal, pp. 313–324.

[2] 
R. S. Khan, J. B. Yang, and J. Handl, “Interactive multiobjective vehicle routing via GAbased dynamic programming,” in Proc. Int. Conf. Transportation Information and Safety, Wuhan, China, 2015, pp. 318–322.

[3] 
K. Dorling, J. Heinrichs, G. G. Messier, and S. Magierowski, “Vehicle routing problems for drone delivery,” IEEE Trans. Syst. Man Cybern. Syst., vol. 47, no. 1, pp. 70–85, Jan. 2017. doi: 10.1109/TSMC.2016.2582745

[4] 
P. Toth and D. Vigo, “An overview of vehicle routing problems,” in The Vehicle Routing Problem, P. Toth and D. Vigo, Eds. Philadelphia, USA: Society for Industrial and Applied Mathematics, 2002, pp. 1–26.

[5] 
J. F. Cordeau, G. Laporte, M. W. P. Savelsbergh, and D. Vigo, “Vehicle routing,” Handb. Oper. Res. Manage. Sci., vol. 14, pp. 367–428, 2007.

[6] 
D. Pisinger and S. Ropke, “A general heuristic for vehicle routing problems,” Comput. Oper. Res., vol. 34, no. 8, pp. 2403–2435, Aug. 2007. doi: 10.1016/j.cor.2005.09.012

[7] 
J. Euchi and H. Chabchoub, “A hybrid tabu search to solve the heterogeneous fixed fleet vehicle routing problem,” Logist. Res., vol. 2, no. 1, pp. 3–11, Jun. 2010. doi: 10.1007/s1215901000283

[8] 
M. B. Guzairov, N. I. Yusupova, O. N. Smetanina, and E. Y. Rassadnikova, “Models and algorithms for the vehicle routing problem with time windows and other conditions,” in Proc. 13th Int. ScientificTechnical Conf. Actual Problems of Electronics Instrument Engineering, Novosibirsk, Russia, 2016, pp. 412–416.

[9] 
H. Afaq and S. Saini, “A novel approach to solve graph based travelling salesman problem using particle swarm optimization technique,” in Proc. IEEE Int. Conf. Computational Intelligence and Computing Research, Coimbatore, India, 2012, pp. 1–4.

[10] 
M. A. Mohammed, M. S. Ahmad, and S. A. Mostafa, “Using genetic algorithm in implementing capacitated vehicle routing problem,” in Proc. Int. Conf. Computer & Information Science, Kuala Lumpur, Malaysia, 2012, pp. 257–262.

[11] 
J. H. Wilck IV and T. M. Cavalier, “A construction heuristic for the split delivery vehicle routing problem,” Am. J. Oper. Res., vol. 2, no. 2, pp. 153–162, Jun. 2012.

[12] 
U. Ritzinger, J. Puchinger, and R. F. Hartl, “A survey on dynamic and stochastic vehicle routing problems,” Int. J. Prod. Res., vol. 54, no. 1, pp. 215–231, Jan. 2016. doi: 10.1080/00207543.2015.1043403

[13] 
T. Jastrzab and A. Buchcik, “Practical applications of smart delivery systems: Mine evacuation as an example of rich vehicle routing problem,” Smart Delivery Systems, J. Nalepa, Ed. Amsterdam, The Netherlands: Elsevier, 2020, pp. 249–268.

[14] 
J. K. Lenstra and A. H. G. R. Kan, “Complexity of vehicle routing and scheduling problems,” Networks, vol. 11, no. 2, pp. 221–227, 1981. doi: 10.1002/net.3230110211

[15] 
B. Golden, S. Raghavan, and E. Wasil, The Vehicle Routing Problem: Latest Advances and New Challenges. New York, USA: Springer, 2008.

[16] 
J. Mańdziuk, “New shades of the vehicle routing problem: Emerging problem formulations and computational intelligence solution methods,” IEEE Trans. Emerg. Top. Comput. Intell., vol. 3, no. 3, pp. 230–244, Jun. 2019. doi: 10.1109/TETCI.2018.2886585

[17] 
A. O. Adewumi and O. J. Adeleke, “A survey of recent advances in vehicle routing problems,” Int. J. Syst. Assur. Eng. Manage., vol. 9, no. 1, pp. 155–172, Feb. 2018. doi: 10.1007/s1319801604934

[18] 
A. Dixit, A. Mishra, and A. Shukla, “Vehicle routing problem with time windows using metaheuristic algorithms: A survey,” in Harmony Search and Nature Inspired Optimization Algorithms, N. Yadav, A. Yadav, J. C. Bansal, K. Deep, and J. H. Kim, Eds. Singapore: Springer, 2019, pp. 539–546.

[19] 
W. J. Cao and W. S. Yang, “A survey of vehicle routing problem,” MATEC Web Conf., vol. 100, p. 01006. Mar. 2017.

[20] 
R. Goel and R. Maini, “Vehicle routing problem and its solution methodologies: A survey,” Int. J. Logist. Syst. Manage., vol. 28, no. 4, pp. 419–435, Nov. 2017.

[21] 
K. A. Smith, “Neural networks for combinatorial optimization: A review of more than a decade of research,” INFORMS J. Comput., vol. 11, no. 1, pp. 15–34, Feb. 1999. doi: 10.1287/ijoc.11.1.15

[22] 
O. I. Abiodun, A. Jantan, A. E. Omolara, K. V. Dada, N. A. Mohamed, and H. Arshad, “Stateoftheart in artificial neural network applications: A survey,” Heliyon, vol. 4, no. 11, p. E00938, Nov. 2018.

[23] 
X. D. Zhang, “Machine learning,” in A Matrix Algebra Approach to Artificial Intelligence, X. D. Zhang, Ed. Singapore: Springer, 2020, pp. 223–440.

[24] 
I. El Naqa and M. J. Murphy, “What is machine learning?” in Machine Learning in Radiation Oncology, I. El Naqa, R. J. Li, M. J. Murphy, Eds. Cham, Germany: Springer, 2015, pp. 3–11.

[25] 
A. François, Q. Cappart, and L. M. Rousseau, “How to evaluate machine learning approaches for combinatorial optimization: Application to the travelling salesman problem,” arXiv preprint arXiv: 1909.13121, 2019.

[26] 
R. B. Bai, X. N. Chen, Z. L. Chen, T. X. Cui, S. H. Gong, W. T. He, X. P. Jiang, H. Jin, J. H. Jin, G. Kendall, J. W. Li, Z. Lu, J. F. Ren, P. Weng, N. Xue, and H. Y. Zhang, “Analytics and machine learning in vehicle routing research,” arXiv preprint arXiv: 2102.10012, 2021

[27] 
G. B. Dantzig and J. H. Ramser, “The truck dispatching problem,” Manage Sci, vol. 6, no. 1, pp. 80–91, Oct. 1959.

[28] 
G. Clarke and J. W. Wright, “Scheduling of vehicles from a central depot to a number of delivery points,” Oper. Res., vol. 12, no. 4, pp. 568–581, Aug. 1964. doi: 10.1287/opre.12.4.568

[29] 
G. Laporte, “What you should know about the vehicle routing problem,” Naval Res. Logist., vol. 54, no. 8, pp. 811–819, Dec. 2007. doi: 10.1002/nav.20261

[30] 
G. Laporte, “Fifty years of vehicle routing,” Transport Sci, vol. 43, no. 4, pp. 408–416, Oct. 2009. doi: 10.1287/trsc.1090.0301

[31] 
R. Fukasawa, H. Longo, J. Lysgaard, M. P. de Aragão, M. Reis, E. Uchoa, and R. F. Werneck, “Robust branchandcutandprice for the capacitated vehicle routing problem,” Math. Program., vol. 106, no. 3, pp. 491–511, May 2006. doi: 10.1007/s101070050644x

[32] 
M. Junger, G. Reinelt, and G. Rinaldi, “The traveling salesman problem,” Handb. Oper. Res. Manage. Sci., vol. 7, pp. 225–330, 1995.

[33] 
P. Toth and D. Vigo, “Models, relaxations and exact approaches for the capacitated vehicle routing problem,” Discrete Appl. Math., vol. 123, no. 1–3, pp. 487–512, Nov. 2002. doi: 10.1016/S0166218X(01)003511

[34] 
M. Fisher, “Vehicle routing,” Handb. Oper. Res. Manage. Sci., vol. 8, pp. 1–33, 1995.

[35] 
J. CaceresCruz, P. Arias, D. Guimarans, D. Riera, and A. A. Juan, “Rich vehicle routing problem: Survey,” ACM Comput. Surv., vol. 47, no. 2, Article No. 32, Jan. 2015.

[36] 
G. Laporte and Y. Nobert, “Exact algorithms for the vehicle routing problem,” NorthHolland Math. Stud., vol. 132, pp. 147–184, 1987.

[37] 
G. Laporte, H. Mercure, and Y. Nobert, “An exact algorithm for the asymmetrical capacitated vehicle routing problem,” Networks, vol. 16, no. 1, pp. 33–46, Mar. 1986. doi: 10.1002/net.3230160104

[38] 
S. Eilon, C. D. T. WatsonGandy, N. Christofides, and R. de Neufville, “Distribution managementmathematical modelling and practical analysis,” IEEE Trans. Syst. Man Cybern., vol. SMC4, no. 6, pp. 589–589, Nov. 1974.

[39] 
M. R. Rao and S. Zionts, “Allocation of transportation units to alternative trips—a column generation scheme with outofkilter subproblems,” Oper. Res., vol. 16, no. 1, pp. 52–63, Feb. 1968. doi: 10.1287/opre.16.1.52

[40] 
N. Christofides and S. Eilon, “An algorithm for the vehicledispatching problem,” J. Oper. Res. Soc., vol. 20, no. 3, pp. 309–318, Sep. 1969. doi: 10.1057/jors.1969.75

[41] 
N. Christofides, A. Mingozzi, and P. Toth, “Exact algorithms for the vehicle routing problem, based on spanning tree and shortest path relaxations,” Math. Program., vol. 20, no. 1, pp. 255–282, Dec. 1981. doi: 10.1007/BF01589353

[42] 
B. L. Golden, E. A. Wasil, J. P. Kelly, and I. M. Chao, “The impact of metaheuristics on solving the vehicle routing problem: Algorithms, problem sets, and computational results,” in Fleet Management and Logistics, T. G. Crainic and G. Laporte, Eds. New York, USA: Springer, 1998, pp. 33–56.

[43] 
S. N. Kumar and R. Panneerselvam, “A survey on the vehicle routing problem and its variants,” Intell. Inf. Manage., vol. 4, no. 3, pp. 66–74, May 2012.

[44] 
J. F. Cordeau and G. Laporte, “Tabu search heuristics for the vehicle routing problem,” Metaheuristic Optimization via Memory and Evolution, R. Sharda, S. Voß, C. Rego, and B. Alidaee, Eds. Boston, USA: Springer, 2005, pp. 145–163.

[45] 
G. A. P. Kindervater and M. W. P. Savelsbergh, “Vehicle routing: Handling edge exchanges,” Local Search in Combinatorial Optimization, E. Aarts and J. K. Lenstra, Eds. New York, USA: Wiley, 1997.

[46] 
S. Desale, A. Rasool, S. Andhale, and P. Rane, “Heuristic and metaheuristic algorithms and their relevance to the real world: A survey,” Int. J. Comput. Eng. Res. Trends, vol. 351, no. 5, pp. 2349–7084, 2015.

[47] 
G. Laporte and F. Semet, “Classical heuristics for the capacitated vrp,” in The Vehicle Routing Problem, P. Toth and D. Vigo, Eds. Philadelphia, USA: Society for Industrial and Applied Mathematics, 2002, pp. 109–128.

[48] 
T. O. Ayodele, “Types of machine learning algorithms,” in New Advances in Machine Learning, Y. G. Zhang, Ed. IntechOpen, 2010, pp. 19–48.

[49] 
C. M. Bishop, Pattern Recognition and Machine Learning. New York, USA: Springer, 2006.

[50] 
L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: A survey,” J. Artif. Intell. Res., vol. 4, no. 1, pp. 237–285, Jun. 1996.

[51] 
R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. 2nd ed. Cambridge, USA: MIT Press, 2018.

[52] 
J. C. Yan, S. Yang, and E. Hancock, “Learning for graph matching and related combinatorial optimization problems,” in Proc. 29th Int. Joint Conf. Artificial Intelligence, Yokohama, Japan, 2020, pp. 4988–4996.

[53] 
B. Mahesh, “Machine learning algorithms—a review,” Int. J. Sci. Res., vol. 9, no. 1, pp. 381–386, Jan. 2020.

[54] 
M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling, “The arcade learning environment: An evaluation platform for general agents,” J. Artif. Intell. Res., vol. 47, pp. 253–279, Jun. 2013. doi: 10.1613/jair.3912

[55] 
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, Jan. 2016. doi: 10.1038/nature16961

[56] 
S. Levine, C. Finn, T. Darrell, and P. Abbeel, “Endtoend training of deep visuomotor policies,” J. Mach. Learn. Res., vol. 17, no. 1, pp. 1334–1373, Jan. 2016.

[57] 
K. Xu, J. L. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio, “Show, attend and tell: Neural image caption generation with visual attention,” in Proc. 32nd Int. Conf. Int. Conf. Machine Learning, Lille, France, 2015, pp. 2048–2057.

[58] 
H. D. Song, I. Triguero, and E. Özcan, “A review on the self and dual interactions between machine learning and optimisation,” Prog. Artif. Intell., vol. 8, no. 2, pp. 143–165, Jun. 2019. doi: 10.1007/s1374801900185z

[59] 
M. KarimiMamaghan, M. Mohammadi, P. Meyer, A. M. KarimiMamaghan, and E. G. Talbi, “Machine learning at the service of metaheuristics for solving combinatorial optimization problems: A stateoftheart,” Eur. J. Oper. Res., vol. 296, no. 2, pp. 393–422, Jan. 2022. doi: 10.1016/j.ejor.2021.04.032

[60] 
L. M. Gambardella and M. Dorigo, “AntQ: A Reinforcement Learning approach to the traveling salesman problem,” in Machine Learning Proceedings 1995, A. Prieditis and S. Russell, Eds. Amsterdam, The Netherlands: Elsevier, 1995, pp. 252–260.

[61] 
F. C. de Lima Junior, J. D. de Melo, and A. D. D. Neto, “Using Qlearning algorithm for initialization of the GRASP metaheuristic and genetic algorithm,” in Proc. Int. Joint Conf. Neural Networks, Orlando, USA, 2007, pp. 1243–1248.

[62] 
M. M. Alipour, S. N. Razavi, M. R. F. Derakhshi, and M. A. Balafar, “A hybrid algorithm using a genetic algorithm and multiagent reinforcement learning heuristic to solve the traveling salesman problem,” Neural Comput. Appl., vol. 30, no. 9, pp. 2935–2951, Nov. 2018. doi: 10.1007/s0052101728804

[63] 
F. Liu and G. Z. Zeng, “Study of genetic algorithm with reinforcement learning to solve the TSP,” Exp. Syst. Appl., vol. 36, no. 3, pp. 6995–7001, Apr. 2009. doi: 10.1016/j.eswa.2008.08.026

[64] 
T. Phiboonbanakit, T. Horanont, T. Supnithi, and V. N. Huynh, “Knowledgebased learning for solving vehicle routing problem,” in Proc. ACM Int. Joint Conf. and 2018 Int. Symp. Pervasive and Ubiquitous Computing and Wearable Computers, Singapore, 2018, pp. 1103–1111.

[65] 
J. Y. Ding, C. Zhang, L. Shen, S. Y. Li, B. Wang, Y. H. Xu, and L. Song, “Accelerating primal solution findings for mixed integer programs based on solution prediction,” in Proc. AAAI Conf. Artif. Intell., vol. 34, no. 2, pp. 1452–1459, Apr. 2020.

[66] 
Y. Sun, A. Ernst, X. D. Li, and J. Weiner, “Generalization of machine learning for problem reduction: A case study on travelling salesman problems,” OR Spectrum, vol. 43, no. 3, pp. 607–633, Sep. 2021. doi: 10.1007/s0029102000604x

[67] 
P. L. N. U. Cooray and T. D. Rupasinghe, “Machine learningbased parameter tuned genetic algorithm for energy minimizing vehicle routing problem,” J. Ind. Eng., vol. 2017, Jan. 2017.

[68] 
F. AlDuoli, G. Rabadi, M. Seck, and H. A. Handley, “Hybridizing metaraps with machine learning algorithms,” in Proc. IEEE Technology and Engineering Management Conf., Evanston, USA, 2018, pp. 1–6.

[69] 
P. Shaw, “A new local search algorithm providing high quality solutions to vehicle routing problems,” University of Strathclyde, Glasgow, Scotland, 1997.

[70] 
G. Schrimpf, J. Schneider, H. StammWilbrandt, and G. Dueck, “Record breaking optimization results using the ruin and recreate principle,” J. Comput. Phys., vol. 159, no. 2, pp. 139–171, Apr. 2000. doi: 10.1006/jcph.1999.6413

[71] 
D. Pisinger and S. Ropke, “Large neighborhood search,” in Handbook of Metaheuristics, M. Gendreau and J. Y. Potvin, Eds. New York, USA: Springer, 2010, pp. 399–419.

[72] 
A. Hottung and K. Tierney, “Neural large neighborhood search for the capacitated vehicle routing problem,” Proc. 24th European Conf. Artificial Intelligence, Santiago de Compostela, Spain, 2020.

[73] 
M. X. Chen, L. Gao, Q. C. Chen, and Z. X. Liu, “Dynamic partial removal: A neural network heuristic for large neighborhood search,” arXiv preprint arXiv: 2005.09330, 2020.

[74] 
L. Gao, M. X. Chen, Q. C. Chen, G. Z. Luo, N. Y. Zhu, and Z. X. Liu, “Learn to design the heuristics for vehicle routing problem,” arXiv preprint arXiv: 2002.08539, 2020.

[75] 
M. M. Solomon, “Algorithms for the vehicle routing and scheduling problems with time window constraints,” Oper. Res., vol. 35, no. 2, pp. 254–265, Apr. 1987. doi: 10.1287/opre.35.2.254

[76] 
R. A. C. Bianchi, C. H. C. Ribeiro, and A. H. R. Costa, “On the relation between ant colony optimization and heuristically accelerated reinforcement learning,” in Proc. 1st Int. Workshop on Hybrid Control of Autonomous System, Pasadena, USA, 2009, pp. 49–55.

[77] 
F. D. Yang, T. C. Jin, T. Y. Liu, X. M. Sun, and J. L. Zhang, “Boosting dynamic programming with neural networks for solving NPhard problems,” in Proc. 10th Asian Conf. Machine Learning, Beijing, China, 2018, pp. 726–739.

[78] 
W. Joe and H. C. Lau, “Deep reinforcement learning approach to solve dynamic vehicle routing problem with stochastic customers,” in Proc. Int. Conf. Autom. Plann. Sched., vol. 30, no. 1, pp. 394–402, Jun. 2020.

[79] 
A. Delarue, R. Anderson, and C. Tjandraatmadja, “Reinforcement learning with combinatorial actions: An application to vehicle routing,” arXiv preprint arXiv: 2010.12001, 2020.

[80] 
Y. Yao, Z. Peng, and B. Xiao, “Parallel hyperheuristic algorithm for multiobjective route planning in a smart city,” IEEE Trans. Veh. Technol., vol. 67, no. 11, pp. 10307–10318, Nov. 2018. doi: 10.1109/TVT.2018.2868942

[81] 
X. Y. Chen and Y. D. Tian, “Learning to perform local rewriting for combinatorial optimization,” in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 6281–6292.

[82] 
H. Lu, X. W. Zhang, and S. Yang, “A learningbased iterative method for solving vehicle routing problems,” in Proc. 8th Int. Conf Learning Representations, Addis Ababa, Ethiopia, 2020.

[83] 
P. R. de O. da Costa, J. Rhuggenaath, Y. Q. Zhang, and A. Akcay, “Learning 2opt heuristics for the traveling salesman problem via deep reinforcement learning,” in Proc. 12th Asian Conf. Machine Learning, Bangkok, Thailand, 2020, pp. 465–480.

[84] 
Y. X. Wu, W. Song, Z. G. Cao, J. Zhang, and A. Lim, “Learning improvement heuristics for solving routing problems,” IEEE Trans. Neural Netw. Learn. Syst., 2021, DOI: 10.1109/TNNLS.2021.3068828

[85] 
O. Vinyals, M. Fortunato, and N. Jaitly, “Pointer networks,” in Proc. 28th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2015, pp. 692–2700.

[86] 
M. V. Vlastelica, A. Paulus, V. Musil, G. Martius, and M. Rolínek, “Differentiation of blackbox combinatorial solvers,” in Proc. 8th Int. Conf. Learning Representations, Addis Ababa, Ethiopia, 2020.

[87] 
Y. N. Ma, J. W. Li, Z. G. Cao, W. Song, L. Zhang, Z. H. Chen, and J. Tang, “Learning to iteratively solve routing problems with dualaspect collaborative transformer,” in Proc. 35th Conf. Neural Information Processing Systems, Sydney, Australia, 2021.

[88] 
W. Qin, Z. L. Zhuang, Z. Z. Huang, and H. Z. Huang, “A novel reinforcement learningbased hyperheuristic for heterogeneous vehicle routing problem,” Comput. Ind. Eng, vol. 156, p. 107252, Jun. 2021.

[89] 
J. Mlejnek and J. Kubalik, “Evolutionary hyperheuristic for capacitated vehicle routing problem,” in Proc. 15th Annu. Conf. Companion on Genetic and Evolutionary Computation, Amsterdam, The Netherlands, 2013, pp. 219–220.

[90] 
M. Okulewicz and J. Mańdziuk, “A particle swarm optimization hyperheuristic for the dynamic vehicle routing problem,” arXiv preprint arXiv: 2006.08809, 2020.

[91] 
D. Meignan, A. Koukam, and J. C. Créput, “Coalitionbased metaheuristic: A selfadaptive metaheuristic using reinforcement learning and mimetism,” J. Heuristics, vol. 16, no. 6, pp. 859–879, Dec. 2010. doi: 10.1007/s1073200991217

[92] 
S. Asta and E. Özcan, “An apprenticeship learning hyperheuristic for vehicle routing in HyFlex,” in Proc. IEEE Symp. Evolving and Autonomous Learning Systems, Orlando, USA, 2014, pp. 65–72.

[93] 
R. Tyasnurita, E. Özcan, and R. John, “Learning heuristic selection using a time delay neural network for open vehicle routing,” in Proc. IEEE Congr. Evolutionary Computation, Donostia, Spain, 2017, pp. 1474–1481.

[94] 
P. Kerschke, L. Kotthoff, J. Bossek, H. H. Hoos, and H. Trautmann, “Leveraging TSP solver complementarity through machine learning,” Evol. Comput., vol. 26, no. 4, pp. 597–620, Dec. 2018. doi: 10.1162/evco_a_00215

[95] 
K. F. Zhao, S. C. Liu, J. X. Yu, and Y. Rong, “Towards featurefree TSP solver selection: A deep learning approach,” in Proc. Int. Joint Conf. Neural Networks, Shenzhen, China, 2021, pp. 1–8.

[96] 
A. E. GutierrezRodríguez, S. E. ConantPablos, J. C. OrtizBayliss, and H. TerashimaMarín, “Selecting metaheuristics for solving vehicle routing problems with time windows via metalearning,” Expert Syst. Appl., vol. 118, pp. 470–481, Mar. 2019. doi: 10.1016/j.eswa.2018.10.036

[97] 
S. Martin, D. Ouelhadj, P. Beullens, E. Ozcan, A. A. Juan, and E. K. Burke, “A multiagent based cooperative approach to scheduling and routing,” Eur. J. Oper. Res., vol. 254, no. 1, pp. 169–178, Oct. 2016. doi: 10.1016/j.ejor.2016.02.045

[98] 
Z. C. Lipton, J. Berkowitz, and C. Elkan, “A critical review of recurrent neural networks for sequence learning,” arXiv preprint arXiv: 1506.00019, 2015.

[99] 
J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proc. Natl. Acad. Sci. USA, vol. 79, no. 8, pp. 2554–2558, Apr. 1982. doi: 10.1073/pnas.79.8.2554

[100] 
S. Hochreiter and J. Schmidhuber, “Long shortterm memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997. doi: 10.1162/neco.1997.9.8.1735

[101] 
M. Schuster and K. K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Trans. Signal Process., vol. 45, no. 11, pp. 2673–2681, Nov. 1997. doi: 10.1109/78.650093

[102] 
I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 3104–3112.

[103] 
A. Karpathy and L. FeiFei, “Deep visualsemantic alignments for generating image descriptions,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Boston, USA: IEEE, 2015, pp. 3128–3137.

[104] 
D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv: 1409.0473, 2014.

[105] 
I. Bello, H. Pham, Q. V. Le, M. Norouzi, and S. Bengio, “Neural combinatorial optimization with reinforcement learning,” arXiv preprint arXiv: 1611.09940, 2016.

[106] 
D. Levy and L. Wolf, “Learning to align the source code to the compiled object code,” in Proc. 34th Int. Conf. Machine Learning, Sydney, Australia, 2017, pp. 2043–2051.

[107] 
M. Deudon, P. Cournut, A. Lacoste, Y. Adulyasak, and L. M. Rousseau, “Learning heuristics for the TSP by policy gradient,” in Proc. 15th Int. Conf. Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Delft, The Netherlands, 2018, pp. 170–181.

[108] 
K. W. Li, T. Zhang, and R. Wang, “Deep reinforcement learning for multiobjective optimization,” IEEE Trans. Cybern., vol. 51, no. 6, pp. 3103–3114, Jun. 2021. doi: 10.1109/TCYB.2020.2977661

[109] 
Y. Kaempfer and L. Wolf, “Learning the multiple traveling salesmen problem with permutation invariant pooling networks,” arXiv preprint arXiv: 1803.09621, 2018.

[110] 
A. V. Le, P. Veerajagadheswar, P. Thiha Kyaw, M. R. Elara, and N. H. K. Nhan, “Coverage path planning using reinforcement learningbased TSP for htetran—a polyaboloinspired selfreconfigurable tiling robot,” Sensors, vol. 21, no. 8, p. 2577, Apr. 2021.

[111] 
Q. Ma, S. W. Ge, D. Y. He, D. Thaker, and I. Drori, “Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning,” arXiv preprint arXiv: 1911.04936, 2019.

[112] 
H. J. Dai, E. B. Khalil, Y. Y. Zhang, B. Dilkina, and L. Song, “Learning combinatorial optimization algorithms over graphs,” in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, USA, 2017, pp. 6351–6361.

[113] 
H. J. Dai, B. Dai, and L. Song, “Discriminative embeddings of latent variable models for structured data,” in Proc. 33rd Int. Conf. Int. Conf. Machine Learning, New York, USA, 2016, pp. 2702–2711.

[114] 
A. L. C. Ottoni, E. G. Nepomuceno, and M. S. de Oliveira, “A response surface model approach to parameter estimation of reinforcement learning for the travelling salesman problem,” J. Control Autom. Electr. Syst., vol. 29, no. 3, pp. 350–359, Jun. 2018. doi: 10.1007/s403130180374y

[115] 
Z. H. Fu, K. B. Qiu, and H. Y. Zha, “Generalize a small pretrained model to arbitrarily large tsp instances,” in Proc. AAAI Conf. Artif. Intell., vo. 5, no. 8, pp. 7474–7482, May 2020.

[116] 
R. K. Zhang, A. Prokhorchuk, and J. Dauwels, “Deep reinforcement learning for traveling salesman problem with time windows and rejections,” in Proc. Int. Joint Conf. Neural Networks, Glasgow, UK, 2020, pp. 1–8.

[117] 
W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” arXiv preprint arXiv: 1803.08475, 2018.

[118] 
Z. H. Xing and S. K. Tu, “A graph neural network assisted monte carlo tree search approach to traveling salesman problem,” IEEE Access, vol. 8, pp. 108418–108428, Jan. 2020. doi: 10.1109/ACCESS.2020.3000236

[119] 
E. Groshev, A. Tamar, M. Goldstein, S. Srivastava, and P. Abbeel, “Learning generalized reactive policies using deep neural networks,” in Proc. 28th AAAI Spring Symposia, Stanford University, Palo Alto, USA, 2018.

[120] 
C. K. Joshi, T. Laurent, and X. Bresson, “An efficient graph convolutional network technique for the travelling salesman problem,” arXiv preprint arXiv: 1906.01227, 2019.

[121] 
M. Prates, P. H. C. Avelar, H. Lemos, L. C. Lamb, and M. Y. Vardi, “Learning to solve NPcomplete problems: A graph neural network for decision TSP,” in Proc. 29th AAAI Conf. Artif. Intell., vol. 33, no. 1, pp. 4731–4738, Jul. 2019.

[122] 
N. Sultana, J. Chan, T. Sarwar, and A. K. Qin, “Learning to optimise general TSP instances,” arXiv preprint arXiv: 2010.12214, 2020

[123] 
C. K. Joshi, T. Laurent, and X. Bresson, “On learning paradigms for the travelling salesman problem,” arXiv preprint arXiv: 1910.07210, 2019.

[124] 
M. Nazari, A. Oroojlooy, M. Takáč, and L. V. Snyder, “Reinforcement learning for solving the vehicle routing problem,” in Proc. 32nd Int. Conf. Neural Information Processing Systems, Montréal Canada, 2018, pp. 9861–9871.

[125] 
A. A. Ibrahim, N. Lo, R. O. Abdulaziz, and J. Ishaya, “Capacitated vehicle routing problem,” Int. J. Res. Granth., vol. 7, no. 3, pp. 310–327, Mar. 2019. doi: 10.29121/granthaalayah.v7.i3.2019.976

[126] 
B. Peng, J. H. Wang, and Z. Z. Zhang, “A deep reinforcement learning algorithm using dynamic attention model for vehicle routing problems,” in Proc. 11th Int. Symp. Artificial Intelligence Algorithms and Applications. Guangzhou, China, 2019, pp. 636–650.

[127] 
L. Duan, Y. Zhan, H. Y. Hu, Y. Gong, J. W. Wei, X. D. Zhang, and Y. H. Xu, “Efficiently solving the practical vehicle routing problem: A novel joint learning approach,” in Proc. 26th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, 2020, pp. 3054–3063.

[128] 
J. M. Vera and A. G. Abad, “Deep reinforcement learning for routing a heterogeneous fleet of vehicles,” in Proc. IEEE Latin American Conf. Computational Intelligence, Guayaquil, Ecuador, 2019, pp. 1–6.

[129] 
J. K. Falkner and L. SchmidtThieme, “Learning to solve vehicle routing problems with time windows through joint attention,” arXiv preprint arXiv: 2006.09100, 2020.

[130] 
L. Xin, W. Song, Z. G. Cao, and J. Zhang, “Stepwise deep learning models for solving routing problems,” IEEE Trans. Ind. Inf., vol. 17, no. 7, pp. 4861–4871, Jul. 2020.

[131] 
L. Xin, W. Song, Z. G. Cao, and, J. Zhang, “Multidecoder attention model with embedding glimpse for solving vehicle routing problems,” in Proc. 35th AAAI Conf. Artificial Intelligence, 2021, pp. 12042–12049.

[132] 
K. Zhang, F. He, Z. C. Zhang, X. Lin, and M. Li, “Multivehicle routing problems with soft time windows: A multiagent reinforcement learning approach,” Trans. Res. Part C Emerg. Technol., vol. 121, p. 102861, Dec. 2020.

[133] 
J. X. Zhao, M. J. Mao, X. Zhao, and J. H. Zou, “A hybrid of deep reinforcement learning and local search for the vehicle routing problems,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 11, pp. 7208–7218, Nov. 2021. doi: 10.1109/TITS.2020.3003163

[134] 
N. Sultana, J. Chan, A. K. Qin, and T. Sarwar, “Learning vehicle routing problems using policy optimisation,” arXiv preprint arXiv: 2012.13269, 2020.

[135] 
I. Drori, A. Kharkar, W. R. Sickinger, B. Kates, Q. Ma, S. W. Ge, E. Dolev, B. Dietrich, D. P. Williamson, and M. Udell, “Learning to solve combinatorial optimization problems on realworld graphs in linear time,” in Proc. 19th IEEE Int. Conf. Machine Learning and Applications, Miami, USA, 2020, pp. 19–24.

[136] 
G. Bono, J. S. Dibangoye, O. Simonin, L. Matignon, and F. Pereyron, “Solving multiagent routing problems using deep attention mechanisms,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 12, pp. 7804–7813, Dec. 2021. doi: 10.1109/TITS.2020.3009289

[137] 
B. Lin, B. Ghaddar, and J. Nathwani, “Deep reinforcement learning for the electric vehicle routing problem with time windows,” IEEE Trans. Intell. Transp. Syst., 2021, DOI: 10.1109/TITS.2021.3105232

[138] 
J. W. Li, L. Xin, Z. G. Cao, A. Lim, W. Song, and J. Zhang, “Heterogeneous attentions for solving pickup and delivery problem via deep reinforcement learning,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 3, pp. 2306–2315, Mar. 2022. doi: 10.1109/TITS.2021.3056120

[139] 
J. W. Li, Y. N. Ma, R. Z. Gao, Z. G. Cao, A. Lim, W. Song, and J. Zhang, “Deep reinforcement learning for solving the heterogeneous capacitated vehicle routing problem,” IEEE Trans. Cybern., 2021, DOI: 10.1109/TCYB.2021.3111082

[140] 
J. L. Li, D. W. Fu, Q. Yuan, H. H. Zhang, K. H. Chen, S. Yang, and F. C. Yang, “A traffic prediction enabled double rewarded value iteration network for route planning,” IEEE Trans. Veh. Technol., vol. 68, no. 5, pp. 4170–4181, May 2019. doi: 10.1109/TVT.2019.2893173

[141] 
J. J. Q. Yu, W. Yu, and J. T. Gu, “Online vehicle routing with neural combinatorial optimization and deep reinforcement learning,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 10, pp. 3806–3817, Oct. 2019. doi: 10.1109/TITS.2019.2909109

[142] 
B. Balaji, J. BellMasterson, E. Bilgin, A. Damianou, P. M. Garcia, A. Jain, R. F. Luo, A. Maggiar, B. Narayanaswamy, and C. Ye, “Orl: Reinforcement learning benchmarks for online stochastic optimization problems,” arXiv preprint arXiv: 1911.10641, 2019.

[143] 
O. M. Andrychowicz, B. Baker, M. Chociej, R. Józefowicz, B. McGrew, J. Pachocki, A. Petron, M. Plappert, G. Powell, A. Ray, J. Schneider, S. Sidor, J. Tobin, P. Welinder, L. L. Weng, and W. Zaremba, “Learning dexterous inhand manipulation,” Int. J. Rob. Res., vol. 39, no. 1, pp. 3–20, Jan. 2020. doi: 10.1177/0278364919887447

[144] 
D. Mukhutdinov, A. Filchenkov, A. Shalyto, and V. Vyatkin, “Multiagent deep learning for simultaneous optimization for time and energy in distributed routing system,” Future Gener. Comput. Syst., vol. 94, pp. 587–600, May 2019. doi: 10.1016/j.future.2018.12.037

[145] 
R. RamachandranPillai and M. Arock, “An adaptive spiking neural p system for solving vehicle routing problems,” Arabian J. Sci. Eng., vol. 45, no. 4, pp. 2513–2529, Apr. 2020. doi: 10.1007/s13369019041536

[146] 
Y. X. Sheng, H. W. Ma, and W. Xia, “A pointer neural network for the vehicle routing problem with task priority and limited resources,” Inf Technol Control, vol. 49, no. 2, pp. 237–248, Sep. 2020. doi: 10.5755/j01.itc.49.2.24613

[147] 
T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attentionbased neural machine translation,” in Proc. Conf. Empirical Methods in Natural Language Processing, Lisbon, Portugal, 2015, pp. 1412–1421.

[148] 
Z. G. Cao, H. L. Guo, W. Song, K. Z. Gao, Z. H. Chen, L. Zhang, and X. X. Zhang, “Using reinforcement learning to minimize the probability of delay occurrence in transportation,” IEEE Trans. Veh. Technol., vol. 69, no. 3, pp. 2424–2436, Mar. 2020. doi: 10.1109/TVT.2020.2964784

[149] 
C. Chen, J. G. Jiang, N. Lv, and S. Y. Li, “An intelligent path planning scheme of autonomous vehicles platoon using deep reinforcement learning on network edge,” IEEE Access, vol. 8, pp. 99059–99069, May 2020. doi: 10.1109/ACCESS.2020.2998015

[150] 
A. K. Kalakanti, S. Verma, T. Paul, and T. Yoshida, “RL SolVeR pro: Reinforcement learning for solving vehicle routing problem,” in Proc. 1st Int. Conf. Artificial Intelligence and Data Sciences, Ipoh, Malaysia, 2019, pp. 94–99.

[151] 
J. Holler, R. Vuorio, Z. W. Qin, X. C. Tang, Y. Jiao, T. C. Jin, S. Singh, C. X. Wang, and J. P. Ye, “Deep reinforcement learning for multidriver vehicle dispatching and repositioning problem,” in Proc. IEEE Int. Conf. Data Mining, Beijing, China, 2019, pp. 1090–1095.

[152] 
N. Vesselinova, R. Steinert, D. F. PerezRamirez, and M. Boman, “Learning combinatorial optimization on graphs: A survey with applications to networking,” IEEE Access, vol. 8, pp. 120388–120416, Jun. 2020. doi: 10.1109/ACCESS.2020.3004964

[153] 
A. Subramanian, E. Uchoa, and L. S. Ochi, “A hybrid algorithm for a class of vehicle routing problems,” Comput. Oper. Res., vol. 40, no. 10, pp. 2519–2531, Oct. 2013. doi: 10.1016/j.cor.2013.01.013

[154] 
X. Yao and Y. Liu, “Machine learning,” in Search Methodologies, E. K. Burke and G. Kendall, Eds. Boston, USA: Springer, 2014, pp. 477–517.

[155] 
N. Altman and M. Krzywinski, “The curse(s) of dimensionality,” Nat. Methods, vol. 15, no. 6, pp. 399–400, Jun. 2018. doi: 10.1038/s415920180019x

[156] 
M. Nickel, K. Murphy, V. Tresp, and E. Gabrilovich, “A review of relational machine learning for knowledge graphs,” Proc. IEEE, vol. 104, no. 1, pp. 11–33, Jan. 2016. doi: 10.1109/JPROC.2015.2483592

[157] 
K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev, and A. Walsh, “Machine learning for molecular and materials science,” Nature, vol. 559, no. 7715, pp. 547–555, Jul. 2018. doi: 10.1038/s4158601803372

[158] 
Q. Cappart, D. Chetélat, E. B. Khalil, A. Lodi, C. Morris, and P. Veličkovič, “Combinatorial optimization and reasoning with graph neural networks,” in Proc. Thirtieth Int. Joint Conf. Artificial Intelligence, Montreal, Canada, 2021, pp. 4348–4355.

[159] 
K. L. Wagstaff, “Machine learning that matters,” in Proc. 29th Int. Conf. Machine Learning, Edinburgh, UK, 2012.
