UAV-Assisted Dynamic Avatar Task Migration for Vehicular Metaverse Services: A Multi-Agent Deep Reinforcement Learning Approach

Jiawen Kang; Junlong Chen; Minrui Xu; Zehui Xiong; Yutao Jiao; Luchao Han; Dusit Niyato; Yongju Tong; Shengli Xie

doi:10.1109/JAS.2023.123993

Volume 11 Issue 2

Feb. 2024

IEEE/CAA Journal of Automatica Sinica



J. Kang, J. Chen, M. Xu, Z. Xiong, Y. Jiao, L. Han, D. Niyato, Y. Tong, and S. Xie, “UAV-assisted dynamic avatar task migration for vehicular metaverse services: A multi-agent deep reinforcement learning approach,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 2, pp. 430–445, Feb. 2024. doi: 10.1109/JAS.2023.123993

Funds: This work was supported in part by NSFC (62102099, U22A2054, 62101594); in part by the Pearl River Talent Recruitment Program (2021QN02S643), and Guangzhou Basic Research Program (2023A04J1699); in part by the National Research Foundation, Singapore, and Infocomm Media Development Authority under its Future Communications Research Development Programme, DSO National Laboratories under the AI Singapore Programme under AISG Award No AISG2-RP-2020-019, Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programme, DesCartes and the Campus for Research Excellence and Technological Enterprise (CREATE) programme, and MOE Tier 1 under Grant RG87/22; in part by the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021-165); in part by the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102; in part by the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204).

Author Bio:
Jiawen Kang (Senior Member, IEEE) received the Ph.D. degree from the Guangdong University of Technology in 2018. He was a Postdoc at Nanyang Technological University, Singapore, from 2018 to 2021. He is currently a Professor at Guangdong University of Technology. His research interests mainly focus on blockchain, security, and privacy protection in wireless communications and networking.

Junlong Chen is currently working toward the B.S. degree at the Guangdong University of Technology. His research interests mainly include deep reinforcement learning, Internet of Things, and Metaverse.

Minrui Xu received the B.S. degree from Sun Yat-Sen University in 2021. He is currently working toward the Ph.D. degree in the School of Computer Science and Engineering, Nanyang Technological University, Singapore. His research interests mainly focus on metaverse, deep reinforcement learning, and mechanism design.

Zehui Xiong (Member, IEEE) is currently an Assistant Professor at Singapore University of Technology and Design, and also an Honorary Adjunct Senior Research Scientist with Alibaba-NTU Singapore Joint Research Institute, Singapore. He received the Ph.D. degree from Nanyang Technological University, Singapore. He was a Visiting Scholar at Princeton University and University of Waterloo. His research interests include wireless communications, IoT, blockchain, edge intelligence, and Metaverse.

Yutao Jiao received the Ph.D. degree in computer science and engineering from Nanyang Technological University, Singapore, in 2020. He is currently an Assistant Professor in the College of Communication Engineering, Army Engineering University of PLA. He has published more than 30 research papers in leading journals and flagship conferences, including 2 ESI highly cited papers. He is the co-inventor of 2 granted patents and has won the IEEE Communications Society Asia Pacific Outstanding Paper Award. He was selected for the Young Elite Scientists Sponsorship Program by China Association for Science and Technology (CAST). He is now serving as a Guest Editor for Computer Communications and a Review Editor for Frontiers in the Internet of Things, and has also served as a TPC Member for IEEE SECON, GC, VTC, IWCMC, etc. His research interests include algorithmic mechanism design, mobile blockchain, decentralized machine learning, and mobile crowdsensing in the field of Internet of Things.

Luchao Han received the bachelor's degree from North China Electric Power University in 2012, and the Ph.D. degree in signal and information processing from the Institute of Acoustics, Chinese Academy of Sciences (IACAS), in 2022. He is currently an Academic Researcher with the National Natural Science Foundation of China. His current research interests include cybersecurity, deep learning, and big data.

Dusit Niyato (Fellow, IEEE) is a Professor in the School of Computer Science and Engineering, Nanyang Technological University, Singapore. He received the B.Eng. degree from King Mongkut's Institute of Technology Ladkrabang (KMITL), Thailand, in 1999, and the Ph.D. degree in electrical and computer engineering from the University of Manitoba, Canada, in 2008. His research interests are in the areas of sustainability, edge intelligence, decentralized machine learning, and incentive mechanism design.

Yongju Tong is currently working toward the B.S. degree at the Guangdong University of Technology. Her interests mainly include deep reinforcement learning, blockchain, and Metaverse.

Shengli Xie (Fellow, IEEE) received the Ph.D. degree in control theory and applications from South China University of Technology in 1997. He is currently a Full Professor and the Head of the Institute of Intelligent Information Processing, Guangdong University of Technology. He has coauthored 2 books and more than 150 research papers in refereed journals and conference proceedings and was awarded Highly Cited Researcher in 2020. His research interests include blind signal processing, machine learning, and Internet of Things. He was awarded the Second Prize of National Natural Science Award of China in 2009. He is an Associate Editor for IEEE Transactions on Systems, Man, and Cybernetics: Systems.
Corresponding author: Zehui Xiong, e-mail: zehui_xiong@sutd.edu.sg
Received Date: 2023-02-12
Revised Date: 2023-05-21
Accepted Date: 2023-09-27

Available Online: 2023-11-22

Abstract

Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is therefore inefficient and impractical for vehicles to process avatar tasks locally. Migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency. However, the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions that depend on current and future vehicle status. To address these challenges, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution by approximately 20% in UAV-assisted vehicular Metaverses.
- Avatar
- blockchain
- metaverses
- multi-agent deep reinforcement learning
- transformer
- UAVs
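
As a rough illustration of the policy-update rule that MAPPO inherits from proximal policy optimization, the sketch below computes the standard clipped surrogate objective for a batch of samples. It is a minimal, generic example and not the authors' implementation: the function name `ppo_clipped_objective`, its parameters, and the default clip range of 0.2 are assumptions introduced here, and the advantage estimates are taken as given.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed default; the clip range is a tunable hyperparameter
) -> float:
    """Mean clipped surrogate objective over a batch of samples.

    ratios[i]     = pi_new(a_i | o_i) / pi_old(a_i | o_i)
    advantages[i] = advantage estimate for sample i (assumed precomputed)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum gives a pessimistic bound, so the objective
        # does not reward moving the policy far from the old one.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    # Toy batch: three samples with made-up ratios and advantages.
    print(ppo_clipped_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

In a multi-agent setting such as MAPPO, the same objective would typically be evaluated on samples pooled from all agents, with advantages supplied by a centralized critic; those parts are omitted from this sketch.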


Highlights

We introduce a novel avatar task migration framework aimed at achieving continuous user-avatar interaction. Within this framework, vehicles choose appropriate edge servers (e.g., RSUs or UAVs) for the migration and pre-migration of tasks, enabling real-time avatar task execution in UAV-assisted vehicular Metaverses.
To solve the service provisioning problem efficiently, we model the avatar task migration process as a partially observable Markov decision process. We formulate the avatar task migration problem as a binary integer program and prove that it is NP-hard. We then tackle it using MADRL algorithms.
We propose a transformer-based decision-making model built on MAPPO that makes decisions in a sequential manner. The proposed model leverages the self-attention mechanism to capture the relationships among agents' interactions in order to obtain the optimal policy for each agent. Numerical results show that the proposed approach outperforms the existing MAPPO approach by approximately 2% and reduces the latency of avatar task execution by around 20%.
To incentivize edge servers (e.g., RSUs or UAVs) to contribute adequate resources to vehicles, we record the transactions of communication, computing, and storage resources exchanged between edge servers and vehicles on the blockchain. Smart contracts ensure the security and traceability of these transactions.
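
The self-attention mechanism mentioned above can be sketched as scaled dot-product attention over per-agent feature vectors, so that each agent's output is a weighted mixture of all agents' values. This is a minimal pure-Python sketch under stated assumptions: the function names `softmax` and `self_attention` and the use of plain lists as agent embeddings are hypothetical, and the paper's actual network architecture is not reproduced here.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(row: Vector) -> Vector:
    """Numerically stable softmax of one score vector."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Row i of each matrix is assumed to belong to agent i (hypothetical layout).
    """
    scale = math.sqrt(len(keys[0]))
    value_dim = len(values[0])
    outputs: Matrix = []
    for query in queries:
        # Similarity of this agent's query to every agent's key.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        weights = softmax(scores)
        # Weighted mixture of all agents' value vectors.
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, values)) for j in range(value_dim)]
        )
    return outputs


if __name__ == "__main__":
    # Two agents with made-up 2-dimensional embeddings used as Q, K, and V.
    embeddings = [[1.0, 0.0], [0.0, 1.0]]
    for row in self_attention(embeddings, embeddings, embeddings):
        print(row)
```

When all keys are identical, the attention weights are uniform and every agent receives the mean of the value vectors; distinct keys let each agent weight the others differently, which is how attention can represent relationships among agents.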
