A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 11 Issue 2
Feb.  2024

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
    CiteScore: 17.6, Top 3% (Q1)
    Google Scholar h5-index: 77, TOP 5
Citation: J. Kang, J. Chen, M. Xu, Z. Xiong, Y. Jiao, L. Han, D. Niyato, Y. Tong, and S. Xie, “UAV-assisted dynamic avatar task migration for vehicular metaverse services: A multi-agent deep reinforcement learning approach,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 2, pp. 430–445, Feb. 2024. doi: 10.1109/JAS.2023.123993

UAV-Assisted Dynamic Avatar Task Migration for Vehicular Metaverse Services: A Multi-Agent Deep Reinforcement Learning Approach

doi: 10.1109/JAS.2023.123993
Funds: This work was supported in part by NSFC (62102099, U22A2054, 62101594); in part by the Pearl River Talent Recruitment Program (2021QN02S643) and Guangzhou Basic Research Program (2023A04J1699); in part by the National Research Foundation, Singapore, and Infocomm Media Development Authority under its Future Communications Research Development Programme, DSO National Laboratories under the AI Singapore Programme under AISG Award No AISG2-RP-2020-019, Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programme, DesCartes and the Campus for Research Excellence and Technological Enterprise (CREATE) programme, and MOE Tier 1 under Grant RG87/22; in part by the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021-165); in part by the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102; and in part by the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204).
  • Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources, so it is inefficient and impractical for vehicles to process avatar tasks locally. Migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising way to decrease computation overhead and reduce task processing latency. However, the high mobility of vehicles makes it challenging for vehicles to make avatar migration decisions independently based on current and future vehicle status. To address these challenges, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution by approximately 20% in UAV-assisted vehicular Metaverses.
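
The abstract names MAPPO as the MADRL algorithm but does not give its objective. As a minimal sketch, assuming MAPPO reuses the standard PPO clipped surrogate for each agent, the per-batch objective can be written as below. The function name `clipped_surrogate`, the sample values, and the clipping range `eps=0.2` are illustrative assumptions and are not taken from the paper.

```python
def clipped_surrogate(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate over a batch of samples.

    ratios[i]     = pi_new(a_i | o_i) / pi_old(a_i | o_i)
    advantages[i] = advantage estimate for sample i
    eps           = clipping range (0.2 is a common default, assumed here)
    """
    total = 0.0
    for r, adv in zip(ratios, advantages):
        # Clip the probability ratio into [1 - eps, 1 + eps].
        clipped_r = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the pessimistic (smaller) of the raw and clipped terms.
        total += min(r * adv, clipped_r * adv)
    return total / len(ratios)


# Hypothetical batch: one sample with a positive advantage, one negative.
example = clipped_surrogate([1.5, 0.5], [1.0, -1.0])
```

In this toy batch the first ratio is capped at 1.2 and the second sample contributes the more pessimistic clipped term, so the mean comes out at approximately 0.2. The clipping is what keeps each policy update close to the previous policy.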

     



    Figures (12) / Tables (2)


    Highlights

    • We introduce a novel avatar task migration framework aimed at achieving continuous user-avatar interaction. Within this framework, vehicles choose appropriate edge servers (e.g., RSUs or UAVs) for the migration and pre-migration of tasks, enabling real-time avatar task execution in UAV-assisted vehicular Metaverses.
    • To solve the service provisioning problem efficiently, we model the avatar task migration process as a partially observable Markov decision process. We formulate the avatar task migration problem as a binary integer program, prove that it is NP-hard, and then tackle it using MADRL algorithms.
    • We propose a transformer-based sequential decision-making model built on MAPPO. The model leverages the self-attention mechanism to capture the relationships among agents' interactions and obtain the optimal policy for each agent. Numerical results show that the proposed approach outperforms the existing MAPPO approach by approximately 2% and reduces the latency of avatar task execution by around 20%.
    • To incentivize edge servers (e.g., RSUs or UAVs) to contribute adequate resources to vehicles, we record transactions of communication, computing, and storage resources exchanged between edge servers and vehicles in the blockchain. Smart contracts ensure the security and traceability of these transactions.
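
The binary integer view of the migration problem mentioned in the highlights can be illustrated with a toy instance. This is only a sketch under stated assumptions: the latency model (upload time plus compute time, with each server splitting its CPU evenly among its tasks), the function names `total_latency` and `best_assignment`, and all numbers are hypothetical and are not the paper's system model.

```python
from itertools import product


def total_latency(assignment, task_bits, task_cycles, rate_bps, cpu_hz):
    """Sum of per-vehicle upload + compute latency for one assignment.

    assignment[v] is the index of the edge server chosen by vehicle v,
    which is equivalent to a binary matrix x[v][s] with one 1 per row.
    """
    load = [0] * len(cpu_hz)
    for s in assignment:
        load[s] += 1
    latency = 0.0
    for v, s in enumerate(assignment):
        upload = task_bits[v] / rate_bps[v][s]
        # Assumption: a server splits its CPU evenly among its tasks.
        compute = task_cycles[v] / (cpu_hz[s] / load[s])
        latency += upload + compute
    return latency


def best_assignment(task_bits, task_cycles, rate_bps, cpu_hz):
    """Exhaustively search all n_servers ** n_vehicles assignments."""
    n_vehicles, n_servers = len(task_bits), len(cpu_hz)
    candidates = product(range(n_servers), repeat=n_vehicles)
    return min(
        candidates,
        key=lambda a: total_latency(a, task_bits, task_cycles, rate_bps, cpu_hz),
    )


# Hypothetical instance: 2 vehicles, server 0 = RSU, server 1 = UAV.
task_bits = [8e6, 8e6]                  # task size in bits
task_cycles = [1e9, 1e9]                # required CPU cycles
rate_bps = [[8e6, 4e6], [4e6, 8e6]]     # uplink rate vehicle -> server
cpu_hz = [2e9, 1e9]                     # server CPU frequency
choice = best_assignment(task_bits, task_cycles, rate_bps, cpu_hz)
```

For this instance the search selects the RSU for the first vehicle and the UAV for the second, with a total latency of 3.5 s. The number of candidates grows as the number of servers raised to the number of vehicles, which suggests why exhaustive search does not scale and a learned policy is used instead.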
