Citation: Y. Tian, Y. Liu, S. Yang, and X. Zhang, “Deep reinforcement learning based on search space independent operators for black-box continuous optimization,” IEEE/CAA J. Autom. Sinica, vol. 13, no. 4, pp. 913–925, Apr. 2026. doi: 10.1109/JAS.2025.125444
[1] Y. Tian, Y. Feng, C. Wang, R. Cao, X. Zhang, X. Pei, K. C. Tan, and Y. Jin, “A large-scale combinatorial many-objective evolutionary algorithm for intensity-modulated radiotherapy planning,” IEEE Trans. Evolutionary Computation, vol. 26, no. 6, pp. 1511–1525, 2022. doi: 10.1109/TEVC.2022.3144675
[2] L. M. Ochoa-Estopier and M. Jobson, “Optimization of heat-integrated crude oil distillation systems. Part I: The distillation model,” Industrial and Engineering Chemistry Research, vol. 54, no. 18, pp. 4988–5000, 2015.
[3] C. He, R. Cheng, C. Zhang, Y. Tian, Q. Chen, and X. Yao, “Evolutionary large-scale multiobjective optimization for ratio error estimation of voltage transformers,” IEEE Trans. Evolutionary Computation, vol. 24, no. 5, pp. 868–881, 2020. doi: 10.1109/TEVC.2020.2967501
[4] K. G. Murty, Linear Programming. Hoboken, USA: John Wiley & Sons, 1983.
[5] M. Li, “Generalized Lagrange multiplier method and KKT conditions with an application to distributed optimization,” IEEE Trans. Circuits and Systems II: Express Briefs, vol. 66, no. 2, pp. 252–256, 2019. doi: 10.1109/TCSII.2018.2842085
[6] S. S. Petrova and A. D. Solov’ev, “The origin of the method of steepest descent,” Historia Mathematica, vol. 24, no. 4, pp. 361–375, 1997.
[7] J. Mockus, Bayesian Approach to Global Optimization: Theory and Applications. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1989.
[8] Y. Tian, H. Chen, X. Xiang, H. Jiang, and X. Zhang, “A comparative study on evolutionary algorithms and mathematical programming methods for continuous optimization,” in Proc. the IEEE Congress on Evolutionary Computation, 2022.
[9] D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization,” IEEE Trans. Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997. doi: 10.1109/4235.585893
[10] Y. Tian, S. Peng, X. Zhang, T. Rodemann, K. C. Tan, and Y. Jin, “A recommender system for metaheuristic algorithms for continuous optimization based on deep recurrent neural networks,” IEEE Trans. Artificial Intelligence, vol. 1, no. 1, pp. 5–18, 2020.
[11] J. H. Holland, Adaptation in Natural and Artificial Systems. Cambridge, MA: MIT Press, 1992.
[12] R. Storn and K. Price, “Differential evolution—A simple and efficient heuristic for global optimization over continuous spaces,” J. Global Optimization, vol. 11, no. 4, pp. 341–359, 1997. doi: 10.1023/A:1008202821328
[13] N. Hansen and A. Ostermeier, “Completely derandomized self-adaptation in evolution strategies,” Evolutionary Computation, vol. 9, no. 2, pp. 159–195, 2001. doi: 10.1162/106365601750190398
[14] Y. Tian, X. Li, H. Ma, X. Zhang, K. C. Tan, and Y. Jin, “Deep reinforcement learning based adaptive operator selection for evolutionary multi-objective optimization,” IEEE Trans. Emerging Topics in Computational Intelligence, vol. 7, no. 4, pp. 1051–1064, 2023.
[15] H. Tong, S. Zhang, C. Huang, and X. Yao, “Algorithm portfolio for parameter tuned evolutionary algorithms,” in Proc. the IEEE Symposium Series on Computational Intelligence, 2019, pp. 1849–1856.
[16] N. Mazyavkina, S. Sviridov, S. Ivanov, and E. Burnaev, “Reinforcement learning for combinatorial optimization: A survey,” Computers and Operations Research, vol. 134, Art. no. 105400, 2021.
[17] L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: A survey,” J. Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.
[18] H. A. Nomer, A. W. Mohamed, and A. H. Yousef, “GSK-RL: Adaptive gaining-sharing knowledge algorithm using reinforcement learning,” in Proc. the 3rd Novel Intelligent and Leading Emerging Sciences Conf., Giza, Egypt, 2021, pp. 169–174.
[19] Y. Tian, X. Zhang, C. He, K. C. Tan, and Y. Jin, “Principled design of translation, scale, and rotation invariant variation operators for metaheuristics,” Chinese J. Electronics, vol. 32, no. 1, pp. 111–129, 2023.
[20] N. Agatz, P. Bouman, and M. Schmidt, “Optimization approaches for the traveling salesman problem with drone,” Transportation Science, vol. 52, no. 4, pp. 965–981, 2018.
[21] G. Xia, Z. Tang, J. Wang, R. Wang, Y. Li, and G. Xia, “A new parallel improvement algorithm for maximum cut problem,” in Advances in Neural Networks – ISNN 2004, vol. 3173, 2004, pp. 419–424.
[22] I. Bello, H. Pham, Q. V. Le, M. Norouzi, and S. Bengio, “Neural combinatorial optimization with reinforcement learning,” arXiv preprint arXiv:1611.09940, 2016.
[23] M. Deudon, P. Cournut, A. Lacoste, Y. Adulyasak, and L.-M. Rousseau, “Learning heuristics for the TSP by policy gradient,” in Integration of Constraint Programming, Artificial Intelligence, and Operations Research. Springer, 2018, pp. 170–181.
[24] W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” arXiv preprint arXiv:1803.08475, 2018.
[25] Q. Ma, S. Ge, D. He, D. Thaker, and I. Drori, “Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning,” arXiv preprint arXiv:1911.04936, 2019.
[26] X. Chen and Y. Tian, “Learning to perform local rewriting for combinatorial optimization,” in Proc. the 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 6281–6292.
[27] L. Gao, M. Chen, Q. Chen, G. Luo, N. Zhu, and Z. Liu, “Learn to design the heuristics for vehicle routing problem,” arXiv preprint arXiv:2002.08539, 2020.
[28] Y. Wu, W. Song, Z. Cao, J. Zhang, and A. Lim, “Learning improvement heuristics for solving routing problems,” IEEE Trans. Neural Networks and Learning Systems, vol. 33, no. 9, pp. 5057–5069, 2022.
[29] Z. Zheng, S. Yao, G. Li, L. Han, and Z. Wang, “Pareto improver: Learning improvement heuristics for multi-objective route planning,” IEEE Trans. Intelligent Transportation Systems, vol. 25, no. 1, pp. 1033–1043, 2024.
[30] J. Sun, X. Liu, T. Bäck, and Z. Xu, “Learning adaptive differential evolution algorithm from optimization experiences by policy gradient,” IEEE Trans. Evolutionary Computation, vol. 25, no. 4, pp. 666–680, 2021. doi: 10.1109/TEVC.2021.3060811
[31] A. Draa, S. Bouzoubia, and I. Boukhalfa, “A sinusoidal differential evolution algorithm for numerical optimisation,” Applied Soft Computing, vol. 27, pp. 99–126, 2015.
[32] S. Das, A. Konar, and U. K. Chakraborty, “Two improved differential evolution schemes for faster global search,” in Proc. the 7th Annual Conf. on Genetic and Evolutionary Computation, 2005, pp. 991–998.
[33] R. Tanabe and A. S. Fukunaga, “Improving the search performance of SHADE using linear population size reduction,” in Proc. the IEEE Congress on Evolutionary Computation, Beijing, China, 2014, pp. 1658–1665.
[34] J. Brest, S. Greiner, B. Boskovic, M. Mernik, and V. Zumer, “Self-adapting control parameters in differential evolution: A comparative study on numerical benchmark problems,” IEEE Trans. Evolutionary Computation, vol. 10, no. 6, pp. 646–657, 2006.
[35] A. K. Qin and P. N. Suganthan, “Self-adaptive differential evolution algorithm for numerical optimization,” in Proc. the IEEE Congress on Evolutionary Computation, Edinburgh, UK, 2005, pp. 1785–1791.
[36] J. Zhang and A. C. Sanderson, “JADE: Adaptive differential evolution with optional external archive,” IEEE Trans. Evolutionary Computation, vol. 13, no. 5, pp. 945–958, 2009.
[37] K. M. Sallam, S. M. Elsayed, R. K. Chakrabortty, and M. J. Ryan, “Improved multi-operator differential evolution algorithm for solving unconstrained problems,” in Proc. the IEEE Congress on Evolutionary Computation, Glasgow, UK, 2020, pp. 1–8.
[38] R. Tanabe and A. Fukunaga, “Success-history based parameter adaptation for differential evolution,” in Proc. the IEEE Congress on Evolutionary Computation, Cancun, Mexico, 2013, pp. 71–78.
[39] F. Zhao, F. Ji, T. Xu, N. Zhu, and Jonrinaldi, “Hierarchical parallel search with automatic parameter configuration for particle swarm optimization,” Applied Soft Computing, vol. 151, Art. no. 111126, 2024.
[40] G. Karafotias, A. E. Eiben, and M. Hoogendoorn, “Generic parameter control with reinforcement learning,” in Proc. the Annual Conf. on Genetic and Evolutionary Computation, 2014, pp. 1319–1326.
[41] H. Zhang, J. Sun, K. C. Tan, and Z. Xu, “Learning adaptive differential evolution by natural evolution strategies,” IEEE Trans. Emerging Topics in Computational Intelligence, vol. 7, no. 3, pp. 872–886, 2023.
[42] Y. Liu, H. Lu, S. Cheng, and Y. Shi, “An adaptive online parameter control algorithm for particle swarm optimization based on reinforcement learning,” in Proc. the IEEE Congress on Evolutionary Computation, Wellington, New Zealand, 2019, pp. 815–822.
[43] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 2018.
[44] R. Tinós, “Artificial neural network based crossover for evolutionary algorithms,” Applied Soft Computing, vol. 95, Art. no. 106512, 2020.
[45] C. He, S. Huang, R. Cheng, K. C. Tan, and Y. Jin, “Evolutionary multiobjective optimization driven by generative adversarial networks (GANs),” IEEE Trans. Cybernetics, vol. 51, no. 6, pp. 3129–3142, 2021.
[46] J. Kudela, “A critical problem in benchmarking and analysis of evolutionary computation methods,” Nature Machine Intelligence, vol. 4, pp. 1238–1245, 2022.
[47] K. Sörensen, “Metaheuristics—The metaphor exposed,” Int. Trans. in Operational Research, vol. 22, no. 1, pp. 3–18, 2015.
[48] N. Hansen, R. Ros, N. Mauny, M. Schoenauer, and A. Auger, “Impacts of invariance in search: When CMA-ES and PSO face ill-conditioned and non-separable problems,” Applied Soft Computing, vol. 11, no. 8, pp. 5755–5769, 2011.
[49] R. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory,” in Proc. the 6th Int. Symposium on Micro Machine and Human Science, Nagoya, Japan, 1995, pp. 39–43.
[50] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.
[51] K. Deb and M. Goyal, “A combined genetic adaptive search (GeneAS) for engineering design,” Computer Science and Informatics, vol. 26, no. 4, pp. 30–45, 1996.
[52] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Trans. Evolutionary Computation, vol. 6, no. 2, pp. 182–197, 2002.
[53] X. Chu, F. Cai, C. Cui, M. Hu, L. Li, and Q. Qin, “Adaptive recommendation model using meta-learning for population-based algorithms,” Information Sciences, vol. 476, pp. 192–210, 2019.
[54] M. Sharma, A. Komninos, M. López-Ibáñez, and D. Kazakov, “Deep reinforcement learning based parameter control in differential evolution,” in Proc. the Genetic and Evolutionary Computation Conf., 2019, pp. 709–717.
[55] S. Zhao, T. Zhang, S. Ma, and M. Chen, “Dandelion optimizer: A nature-inspired metaheuristic algorithm for engineering applications,” Engineering Applications of Artificial Intelligence, vol. 114, Art. no. 105075, 2022.
[56] B. Abdollahzadeh, F. S. Gharehchopogh, N. Khodadadi, and S. Mirjalili, “Mountain gazelle optimizer: A new nature-inspired metaheuristic algorithm for global optimization problems,” Advances in Engineering Software, vol. 174, Art. no. 103282, 2022.
[57] Y. Tian, R. Cheng, X. Zhang, and Y. Jin, “PlatEMO: A MATLAB platform for evolutionary multi-objective optimization [Educational Forum],” IEEE Computational Intelligence Magazine, vol. 12, no. 4, pp. 73–87, 2017.
[58] X. Yao, Y. Liu, and G. Lin, “Evolutionary programming made faster,” IEEE Trans. Evolutionary Computation, vol. 3, no. 2, pp. 82–102, 1999.
[59] J. Derrac, S. García, D. Molina, and F. Herrera, “A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms,” Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, 2011. doi: 10.1016/j.swevo.2011.02.002
[60] Y. Tian, X. Zheng, X. Zhang, and Y. Jin, “Efficient large-scale multiobjective optimization based on a competitive swarm optimizer,” IEEE Trans. Cybernetics, vol. 50, no. 8, pp. 3696–3708, 2020. doi: 10.1109/TCYB.2019.2906383
[61] Y. Tian, H. Chen, H. Ma, X. Zhang, K. C. Tan, and Y. Jin, “Integrating conjugate gradients into evolutionary algorithms for large-scale continuous multi-objective optimization,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 10, pp. 1801–1817, 2022. doi: 10.1109/JAS.2022.105875
[62] Y. Yuan, X. Luo, M. Shang, and Z. Wang, “A Kalman-filter-incorporated latent factor analysis model for temporally dynamic sparse data,” IEEE Trans. Cybernetics, vol. 53, no. 9, pp. 5788–5801, 2023. doi: 10.1109/TCYB.2022.3185117
[63] J. Li, X. Luo, Y. Yuan, and S. Gao, “A nonlinear PID-incorporated adaptive stochastic gradient descent algorithm for latent factor analysis,” IEEE Trans. Autom. Science and Engineering, vol. 21, no. 3, pp. 3742–3756, 2024. doi: 10.1109/TASE.2023.3284819
[64] Y. Yuan, J. Li, and X. Luo, “A fuzzy PID-incorporated stochastic gradient descent algorithm for fast and accurate latent factor analysis,” IEEE Trans. Fuzzy Systems, vol. 32, no. 7, pp. 4049–4061, 2024. doi: 10.1109/TFUZZ.2024.3389733
[65] X. Xiang, Y. Tian, J. Xiao, and X. Zhang, “A clustering-based surrogate-assisted multiobjective evolutionary algorithm for shelter location under uncertainty of road networks,” IEEE Trans. Industrial Informatics, vol. 16, no. 12, pp. 7544–7555, 2020. doi: 10.1109/TII.2019.2962137