A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation.
Volume 11, Issue 4, Apr. 2024

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
  • CiteScore: 17.6, Top 3% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: F. Ming, W. Gong, L. Wang, and Y. Jin, “Constrained multi-objective optimization with deep reinforcement learning assisted operator selection,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 4, pp. 919–931, Apr. 2024. doi: 10.1109/JAS.2023.123687

Constrained Multi-Objective Optimization With Deep Reinforcement Learning Assisted Operator Selection

doi: 10.1109/JAS.2023.123687
Funds: This work was partly supported by the National Natural Science Foundation of China (62076225, 62073300) and the Natural Science Foundation for Distinguished Young Scholars of Hubei (2019CFA081).
Abstract
Solving constrained multi-objective optimization problems with evolutionary algorithms has attracted considerable attention. Various constrained multi-objective optimization evolutionary algorithms (CMOEAs) have been developed with the use of different algorithmic strategies, evolutionary operators, and constraint-handling techniques. The performance of CMOEAs may be heavily dependent on the operators used; however, it is usually difficult to select suitable operators for the problem at hand. Hence, improving operator selection is promising and necessary for CMOEAs. This work proposes an online operator selection framework assisted by deep reinforcement learning. The dynamics of the population, including convergence, diversity, and feasibility, are regarded as the state; the candidate operators are considered as actions; and the improvement of the population state is treated as the reward. By using a Q-network to learn a policy that estimates the Q-values of all actions, the proposed approach can adaptively select the operator that maximizes the improvement of the population according to the current state, thereby improving algorithmic performance. The framework is embedded into four popular CMOEAs and assessed on 42 benchmark problems. The experimental results reveal that the proposed deep reinforcement learning-assisted operator selection significantly improves the performance of these CMOEAs, and the resulting algorithm achieves better versatility compared with nine state-of-the-art CMOEAs.
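As a rough illustration of the mechanism described above, the following is a minimal sketch, assuming a small PyTorch Q-network: the population state (convergence, diversity, feasibility) is a feature vector, each candidate operator is an action, and an epsilon-greedy policy picks the operator with the largest estimated Q-value. The class and function names, state dimension, and network size are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: names and dimensions are assumptions, not the paper's code.
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a population-state vector to one Q-value per candidate operator."""
    def __init__(self, state_dim: int, n_operators: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_operators),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def select_operator(qnet: QNetwork, state: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy selection: explore a random operator with probability
    epsilon, otherwise take the operator with the largest estimated Q-value."""
    n_operators = qnet.net[-1].out_features
    if random.random() < epsilon:
        return random.randrange(n_operators)
    with torch.no_grad():
        return int(qnet(state).argmax().item())

# Toy usage: a 3-dimensional state (convergence, diversity, feasibility) and two operators.
qnet = QNetwork(state_dim=3, n_operators=2)
state = torch.tensor([0.42, 0.17, 0.65])      # normalized population indicators (assumed)
print("selected operator index:", select_operator(qnet, state, epsilon=0.1))
```

In a full training loop, the observed improvement of the population state after applying the chosen operator would serve as the reward used to update the Q-network.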

     

  • 1 An evolutionary operator refers to the operations of an evolutionary algorithm used to generate offspring solutions, such as the crossover and mutation of GA, the differential variation of DE, and the particle swarm update of PSO/CSO (a toy sketch of two such operators follows).
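For concreteness, here is a small NumPy sketch of two such operators that an operator-selection framework could choose between. It follows the standard simulated binary crossover (SBX) and DE/rand/1 formulas; function names and parameter values are illustrative, not taken from the paper.

```python
# Textbook-style operator sketches (assumed parameter values), not the paper's implementation.
import numpy as np

def sbx_crossover(p1, p2, eta=20.0, rng=np.random.default_rng()):
    """GA-style simulated binary crossover: returns one child from two parents."""
    u = rng.random(p1.shape)
    beta = np.where(u <= 0.5,
                    (2 * u) ** (1 / (eta + 1)),
                    (1 / (2 * (1 - u))) ** (1 / (eta + 1)))
    return 0.5 * ((1 + beta) * p1 + (1 - beta) * p2)

def de_rand_1(x1, x2, x3, F=0.5):
    """DE/rand/1 differential variation: base vector plus a scaled difference vector."""
    return x1 + F * (x2 - x3)

rng = np.random.default_rng(0)
pop = rng.random((4, 5))                        # toy population: 4 solutions, 5 variables
child_ga = sbx_crossover(pop[0], pop[1], rng=rng)
child_de = de_rand_1(pop[0], pop[1], pop[2])
```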
  • Supplementary material: JAS-2023-0129-supp.pdf



    Highlights

    • A generic deep reinforcement learning-assisted multi-objective optimization operator selection model
    • The model can contain an arbitrary number of operators
    • An adaptive operator-assisted constrained multi-objective optimization framework
    • The framework can be embedded into any CMOEA to improve its performance (a minimal embedding sketch is given below)
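To make the last highlight concrete, the following is a minimal, hypothetical sketch of how an adaptive operator-selection agent could be wired into a generic CMOEA generation loop. The toy population model, the two perturbation operators, and the bandit-style value update (standing in for the full Q-network) are illustrative assumptions rather than the authors' implementation; only the control flow (select an operator from the current state, generate offspring with it, measure the improvement as the reward, update the selection policy) is the point.

```python
# Hypothetical embedding sketch: all functions and values here are toy stand-ins.
import random

def population_state(pop):
    """Toy stand-in for the (convergence, diversity, feasibility) indicators."""
    return sum(pop) / len(pop)                  # lower mean = "better" in this toy

def environmental_selection(pop, size):
    return sorted(pop)[:size]                   # keep the best (smallest) solutions

operators = [
    lambda pop: [x + random.gauss(0, 0.1) for x in pop],      # operator 0: Gaussian move
    lambda pop: [x * random.uniform(0.5, 1.5) for x in pop],  # operator 1: random scaling
]

q = [0.0, 0.0]                                  # per-operator value estimates (bandit-style)
pop = [random.random() for _ in range(10)]
for _ in range(50):
    # Epsilon-greedy choice of the next evolutionary operator.
    op = random.randrange(2) if random.random() < 0.1 else max(range(2), key=lambda i: q[i])
    before = population_state(pop)
    offspring = operators[op](pop)
    pop = environmental_selection(pop + offspring, size=10)
    reward = before - population_state(pop)     # improvement of the population state
    q[op] += 0.1 * (reward - q[op])             # simple value update replacing the Q-network
print("final population state:", population_state(pop))
```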
