IEEE/CAA Journal of Automatica Sinica
Citation: J. Kang, J. Chen, M. Xu, Z. Xiong, Y. Jiao, L. Han, D. Niyato, Y. Tong, and S. Xie, “UAV-assisted dynamic avatar task migration for vehicular metaverse services: A multi-agent deep reinforcement learning approach,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 2, pp. 430–445, Feb. 2024. doi: 10.1109/JAS.2023.123993
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in the 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources, making it inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency. However, the high mobility of vehicles makes it challenging for them to independently make avatar migration decisions based on current and future vehicle status. To address these challenges, in this paper we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome the slow convergence resulting from the curse of dimensionality and the non-stationarity caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach built on sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and to ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution by approximately 20% in UAV-assisted vehicular Metaverses.
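For readers unfamiliar with the training objective behind MAPPO, the sketch below shows, in PyTorch, a parameter-shared actor trained with the standard PPO clipped surrogate loss, which is the general MAPPO-style setup the abstract refers to. All names (SharedActor, ppo_clip_loss), layer sizes, and hyperparameters are illustrative assumptions rather than the authors' implementation, and the advantage values would normally come from a centralized critic.

```python
# Minimal, illustrative sketch of a parameter-shared MAPPO-style actor update
# using the PPO clipped surrogate objective. Shapes, layer sizes, and
# hyperparameters are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn


class SharedActor(nn.Module):
    """One actor network shared by all agents (parameter sharing)."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.distributions.Categorical:
        # Discrete migration decision (e.g., keep local / migrate to RSU / migrate to UAV).
        return torch.distributions.Categorical(logits=self.net(obs))


def ppo_clip_loss(actor, obs, actions, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate loss averaged over all agents' transitions."""
    dist = actor(obs)                                   # obs: [batch * n_agents, obs_dim]
    log_probs = dist.log_prob(actions)                  # actions: [batch * n_agents]
    ratio = torch.exp(log_probs - old_log_probs)        # importance-sampling ratio
    surr1 = ratio * advantages
    surr2 = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(surr1, surr2).mean()              # negate to maximize the surrogate


if __name__ == "__main__":
    # Toy usage: 4 agents (e.g., vehicles) and a batch of 32 joint transitions.
    obs_dim, n_actions, n_agents, batch = 8, 3, 4, 32
    actor = SharedActor(obs_dim, n_actions)
    obs = torch.randn(batch * n_agents, obs_dim)
    with torch.no_grad():
        old_dist = actor(obs)
        actions = old_dist.sample()
        old_log_probs = old_dist.log_prob(actions)
    advantages = torch.randn(batch * n_agents)          # in practice, from GAE with a centralized critic
    loss = ppo_clip_loss(actor, obs, actions, old_log_probs, advantages)
    loss.backward()
    print(f"surrogate loss: {loss.item():.4f}")
```

The transformer-based variant described in the abstract would replace this MLP actor with an attention-based sequence model over the agents' observations and actions, so that inter-agent relationships are represented explicitly rather than only through shared parameters.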