A journal of the IEEE and the Chinese Association of Automation (CAA), publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 15.3, Top 1 (SCI Q1)
  • CiteScore: 23.5, Top 2% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: J. Li and T. Zhou, “A robust large-scale multiagent deep reinforcement learning method for coordinated automatic generation control of integrated energy systems in a performance-based frequency regulation market,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 7, pp. 1–14, Jul. 2025. doi: 10.1109/JAS.2024.124482

A Robust Large-Scale Multiagent Deep Reinforcement Learning Method for Coordinated Automatic Generation Control of Integrated Energy Systems in a Performance-Based Frequency Regulation Market

doi: 10.1109/JAS.2024.124482
Funds: This work was supported by the National Natural Science Foundation of China (52307118)
Abstract: To enhance the frequency stability and lower the regulation mileage payment of a multiarea integrated energy system (IES) that supports the power Internet of Things (IoT), this paper proposes a data-driven cooperative method for automatic generation control (AGC). The method consists of adaptive fractional-order proportional-integral (FOPI) controllers and a novel efficient integration exploration multiagent twin delayed deep deterministic policy gradient (EIE-MATD3) algorithm. The FOPI controllers are designed for each area based on the performance-based frequency regulation market mechanism. The EIE-MATD3 algorithm is used to tune the coefficients of the FOPI controllers in real time using centralized training and decentralized execution. The algorithm incorporates imitation learning and efficient integration exploration to obtain a more robust coordinated control strategy. An experiment on the four-area China Southern Grid (CSG) real-time digital system shows that the proposed method can improve the control performance and reduce the regulation mileage payment of each area in the IES.
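The abstract does not give the controller equations, but the structure it describes, a fractional-order PI law whose coefficients are retuned online by an external agent, can be sketched as follows. This is a minimal illustration assuming a truncated Grünwald–Letnikov approximation of the fractional integral; the class name FOPIController, the set_gains hook, and all gain values are hypothetical and not the authors' implementation.

```python
import numpy as np

class FOPIController:
    """Minimal discrete fractional-order PI (FOPI) controller sketch.

    u(t) = Kp * e(t) + Ki * I^lambda e(t), where the fractional integral
    I^lambda is approximated with truncated Grunwald-Letnikov weights.
    """

    def __init__(self, kp, ki, lam, dt, memory=500):
        self.dt = dt
        self.memory = memory                 # truncation length of the GL sum
        self.errors = np.zeros(memory)       # most recent error first
        self.set_gains(kp, ki, lam)

    def set_gains(self, kp, ki, lam):
        """Called by an external tuner (e.g., an RL agent) to retune gains online."""
        self.kp, self.ki, self.lam = kp, ki, lam
        # GL weights c_j = (-1)^j * binom(-lam, j), via the standard recurrence.
        w = np.empty(self.memory)
        w[0] = 1.0
        for j in range(1, self.memory):
            w[j] = w[j - 1] * (1.0 - (1.0 - lam) / j)
        self.weights = w

    def update(self, error):
        """Advance one control step and return the control signal u."""
        self.errors = np.roll(self.errors, 1)
        self.errors[0] = error
        frac_integral = (self.dt ** self.lam) * np.dot(self.weights, self.errors)
        return self.kp * error + self.ki * frac_integral


if __name__ == "__main__":
    # Toy closed loop: first-order lag plant regulated toward zero deviation.
    ctrl = FOPIController(kp=0.8, ki=0.5, lam=0.9, dt=0.1)
    x = 1.0                                  # initial deviation
    for _ in range(200):
        u = ctrl.update(-x)                  # drive the deviation to zero
        x += 0.1 * (-0.5 * x + u)            # simple plant: dx/dt = -0.5x + u
    print(f"final deviation: {x:.4f}")
```

Setting lam = 1 recovers an ordinary PI controller, which makes the fractional order a natural extra degree of freedom for a tuning agent to adjust.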

     






    Highlights

    • A data-driven FOPI-based AGC controller is proposed for integrated energy systems (IESs)
    • An EIE-MATD3 algorithm is proposed to improve the robustness of the AGC controllers (a structural sketch of its centralized-training, decentralized-execution setup follows below)
    • The FOPI controllers are designed for each area based on the performance-based frequency regulation market mechanism
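As a rough structural illustration of the centralized-training, decentralized-execution pattern behind TD3-style multiagent methods such as the EIE-MATD3 algorithm summarized above, the sketch below pairs a small per-area actor (mapping a local observation to FOPI gain adjustments) with a centralized twin critic over the joint observation and action. All class names, network sizes, and dimensions are illustrative assumptions rather than the paper's code; the imitation-learning and efficient-integration-exploration components are omitted.

```python
import torch
import torch.nn as nn

class AreaActor(nn.Module):
    """Decentralized actor: maps an area's local observation to FOPI gain adjustments."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh(),   # bounded gain adjustments in [-1, 1]
        )

    def forward(self, obs):
        return self.net(obs)

class CentralizedTwinCritic(nn.Module):
    """Centralized twin critic (TD3-style): scores the joint observation-action pair
    with two independent Q networks to limit overestimation bias."""
    def __init__(self, joint_obs_dim, joint_act_dim):
        super().__init__()
        def q_net():
            return nn.Sequential(
                nn.Linear(joint_obs_dim + joint_act_dim, 128), nn.ReLU(),
                nn.Linear(128, 1),
            )
        self.q1, self.q2 = q_net(), q_net()

    def forward(self, joint_obs, joint_act):
        x = torch.cat([joint_obs, joint_act], dim=-1)
        return self.q1(x), self.q2(x)

# Illustrative setup for a four-area system (dimensions are assumptions).
n_areas, obs_dim, act_dim = 4, 6, 3          # act_dim = 3: adjust Kp, Ki, lambda
actors = [AreaActor(obs_dim, act_dim) for _ in range(n_areas)]
critic = CentralizedTwinCritic(n_areas * obs_dim, n_areas * act_dim)

# Decentralized execution: each area acts from its own observation only.
local_obs = [torch.randn(1, obs_dim) for _ in range(n_areas)]
local_act = [actor(obs) for actor, obs in zip(actors, local_obs)]

# Centralized training: the twin critic sees all observations and actions.
q1, q2 = critic(torch.cat(local_obs, dim=-1), torch.cat(local_act, dim=-1))
target_q = torch.min(q1, q2)                  # clipped double-Q target, as in TD3
print(target_q.shape)                         # torch.Size([1, 1])
```

The torch.min over the two critic outputs is the clipped double-Q device from TD3 that the MATD3 family inherits to curb value overestimation during centralized training.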
