IEEE/CAA Journal of Automatica Sinica
Citation: R. Wang, Z. Hu, J. Yu, and J. Cheng, “Modelling diverse interactions and multimodality for pedestrian trajectory prediction,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 9, pp. 1801–1813, Sept. 2025. doi: 10.1109/JAS.2025.125363
[1] |
Z. Wei, H. Zhao, Z. Li, X. Bu, Y. Chen, X. Zhang, Y. Lv, and F. Y. Wang, “STGSA: A novel spatial-temporal graph synchronous aggregation model for traffic prediction,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 1, pp. 226–238, Jan. 2023. doi: 10.1109/JAS.2023.123033
|
[2] |
X. Li, Y. Liu, K. Wang, and F. Y. Wang, “A recurrent attention and interaction model for pedestrian trajectory prediction,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 5, pp. 1361–1370, Sep. 2020.
|
[3] |
X. Zhao, Y. Chen, J. Guo, and D. Zhao, “A spatial-temporal attention model for human trajectory prediction,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 4, pp. 965–974, Jul. 2020. doi: 10.1109/JAS.2020.1003228
|
[4] |
Y. Zheng, Q. Li, C. Wang, X. Wang, and L. Hu, “Multi-source adaptive selection and fusion for pedestrian dead reckoning,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 12, pp. 2174–2185, Dec. 2022.
|
[5] |
A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, F. F. Li, and S. Savarese, “Social LSTM: Human trajectory prediction in crowded spaces,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 961–971.
|
[6] |
A. Gupta, J. Johnson, F. F. Li, S. Savarese, and A. Alahi, “Social GAN: Socially acceptable trajectories with generative adversarial networks,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 2255–2264.
|
[7] |
J. Amirian, J. B. Hayet, and J. Pettré, “Social ways: Learning multi-modal distributions of pedestrian trajectories with GANs,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition Workshops, Long Beach, USA, 2019, pp. 2964–2972.
|
[8] |
V. Kosaraju, A. Sadeghian, R. Martín-Martín, I. Reid, S. Rezatofighi, and S. Savarese, “Social-BiGAT: Multimodal trajectory forecasting using bicycle-GAN and graph attention networks,” in Proc. 33rd Int. Conf. Neural Information Processing Systems, 2019, p. 13.
|
[9] |
H. Manh and G. Alaghband, “Scene-LSTM: A model for human trajectory prediction,” arXiv preprint arXiv:1808.04018, 2018.
|
[10] |
H. Xue, D. Q. Huynh, and M. Reynolds, “SS-LSTM: A hierarchical LSTM model for pedestrian trajectory prediction,” in Proc. IEEE Winter Conf. Applications of Computer Vision, Lake Tahoe, USA, 2018, pp. 1186–1194.
|
[11] |
M. Meng, Z. Wu, T. Chen, X. Cai, X. S. Zhou, F. Yang, and D. Shen, “Forecasting human trajectory from scene history,” in Proc. 36th Int. Conf. Neural Information Processing Systems, New Orleans, USA, 2022, p. 1807.
|
[12] |
A. Sadeghian, V. Kosaraju, A. Sadeghian, N. Hirose, H. Rezatofighi, and S. Savarese, “SoPhie: An attentive GAN for predicting paths compliant to social and physical constraints,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 1349–1358.
|
[13] |
A. Mohamed, D. Zhu, W. Vu, M. Elhoseiny, and C. Claudel, “Social-implicit: Rethinking trajectory prediction evaluation and the effectiveness of implicit maximum likelihood estimation,” in Proc. 17th European Conf. Computer Vision, Tel Aviv, Israel, 2022, pp. 463–479.
|
[14] |
J. Liang, L. Jiang, J. C. Niebles, A. G. Hauptmann, and F. F. Li, “Peeking into the future: Predicting future person activities and locations in videos,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 5718–5727.
|
[15] |
C. Yu, X. Ma, J. Ren, H. Zhao, and S. Yi, “Spatio-temporal graph transformer networks for pedestrian trajectory prediction,” in Proc. 16th European Conf. Computer Vision, Glasgow, UK, 2020, pp. 507–523.
|
[16] |
F. Giuliari, I. Hasan, M. Cristani, and F. Galasso, “Transformer networks for trajectory forecasting,” in Proc. 25th Int. Conf. Pattern Recognition, Milan, Italy, 2020, pp. 10335–10342.
|
[17] |
L. L. Li, B. Yang, M. Liang, W. Zeng, M. Ren, S. Segal, and R. Urtasun, “End-to-end contextual perception and prediction with interaction transformer,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, Las Vegas, USA, 2020, pp. 5784–5791.
|
[18] |
A. Mohamed, K. Qian, M. Elhoseiny, and C. Claudel, “Social-STGCNN: A social spatio-temporal graph convolutional neural network for human trajectory prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Seattle, USA, 2020, pp. 14412–14420.
|
[19] |
L. Shi, L. Wang, C. Long, S. Zhou, M. Zhou, Z. Niu, and G. Hua, “SGCN: Sparse graph convolution network for pedestrian trajectory prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Nashville, USA, 2021, pp. 8990–8999.
|
[20] |
R. Wang, Z. Hu, X. Song, and W. Li, “Trajectory distribution aware graph convolutional network for trajectory prediction considering spatio-temporal interactions and scene information,” IEEE Trans. Knowl. Data Eng., vol. 36, no. 8, pp. 4304–4316, Aug. 2024. doi: 10.1109/TKDE.2023.3329676
|
[21] |
R. Wang, X. Song, Z. Hu, and Y. Cui, “Spatio-temporal interaction aware and trajectory distribution aware graph convolution network for pedestrian multimodal trajectory prediction,” IEEE Trans. Instrum. Meas., vol. 72, p. 5001211, 2023.
|
[22] |
N. Nikhil and B. T. Morris, “Convolutional neural network for trajectory prediction,” in Proc. European Conf. Computer Vision, Munich, Germany, 2018, pp. 186–196.
|
[23] |
H. Cui, V. Radosavljevic, F. C. Chou, T. H. Lin, T. Nguyen, T. K. Huang, J. Schneider, and N. Djuric, “Multimodal trajectory predictions for autonomous driving using deep convolutional networks,” in Proc. Int. Conf. Robotics and Automation, Montreal, Canada, 2019, pp. 2090–2096.
|
[24] |
J. Duan, L. Wang, C. Long, S. Zhou, F. Zheng, L. Shi, and G. Hua, “Complementary attention gated network for pedestrian trajectory prediction,” in Proc. 36th AAAI Conf. Artificial Intelligence, 2022, pp. 542–550.
|
[25] |
L. Shi, L. Wang, C. Long, S. Zhou, F. Zheng, N. Zheng, and G. Hua, “Social interpretable tree for pedestrian trajectory prediction,” in Proc. 36th AAAI Conf. Artificial Intelligence, 2022, pp. 2235–2243.
|
[26] |
C. Ge, S. Song, and G. Huang, “Causal intervention for human trajectory prediction with cross attention mechanism,” in Proc. 37th AAAI Conf. Artificial Intelligence, Washington, USA, 2023, pp. 658–666.
|
[27] |
I. Bae and H. G. Jeon, “A set of control points conditioned pedestrian trajectory prediction,” in Proc. 37th AAAI Conf. Artificial Intelligence, Washington, USA, 2023, pp. 6155–6165.
|
[28] |
T. Maeda and N. Ukita, “Fast inference and update of probabilistic density estimation on trajectory prediction,” in Proc. IEEE/CVF Int. Conf. Computer Vision, Paris, France, 2023, pp. 9761–9771.
|
[29] |
K. Chen, X. Song, and X. Ren, “Pedestrian trajectory prediction in heterogeneous traffic using pose keypoints-based convolutional encoder-decoder network,” IEEE Trans. Circuits Syst. Video Technol., vol. 31, no. 5, pp. 1764–1775, May 2021. doi: 10.1109/TCSVT.2020.3013254
|
[30] |
H. Tran, V. Le, and T. Tran, “Goal-driven long-term trajectory prediction,” in Proc. IEEE/CVF Winter Conf. Applications of Computer Vision, Waikoloa, USA, 2021, pp. 796–805.
|
[31] |
Y. Hu, S. Chen, Y. Zhang, and X. Gu, “Collaborative motion prediction via neural motion message passing,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Seattle, USA, 2020, pp. 6318–6327.
|
[32] |
J. Sun, Q. Jiang, and C. Lu, “Recursive social behavior graph for trajectory prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Seattle, USA, 2020, pp. 657–666.
|
[33] |
P. Negri and D. Garayalde, “Concatenating multiple trajectories using Kalman filter for pedestrian tracking,” in Proc. IEEE Biennial Congr. Argentina, Bariloche, Argentina, 2014, pp. 364–369.
|
[34] |
R. Toledo-Moreo and M. A. Zamora-Izquierdo, “IMM-based lane-change prediction in highways with low-cost GPS/INS,” IEEE Trans. Intell. Transp. Syst., vol. 10, no. 1, pp. 180–185, Mar. 2009. doi: 10.1109/TITS.2008.2011691
|
[35] |
J. Wiest, M. Höffken, U. Kreßel, and K. Dietmayer, “Probabilistic trajectory prediction with Gaussian mixture models,” in Proc. IEEE Intelligent Vehicles Symp., Madrid, Spain, 2012, pp. 141–146.
|
[36] |
C. F. Wakim, S. Capperon, and J. Oksman, “A Markovian model of pedestrian behavior,” in Proc. IEEE Int. Conf. Systems, Man and Cybernetics, The Hague, Netherlands, 2004, pp. 4028–4033.
|
[37] |
P. Zhang, W. Ouyang, P. Zhang, J. Xue, and N. Zheng, “SR-LSTM: State refinement for LSTM towards pedestrian trajectory prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 12077–12086.
|
[38] |
Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized autoregressive pretraining for language understanding,” in Proc. 33rd Int. Conf. Neural Information Processing Systems, 2019, p. 517.
|
[39] |
Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, and R. Soricut, “ALBERT: A lite BERT for self-supervised learning of language representations,” in Proc. 8th Int. Conf. Learning Representations, Addis Ababa, Ethiopia, 2020.
|
[40] |
A. Bhattacharyya, B. Schiele, and M. Fritz, “Accurate and diverse sampling of sequences based on a ‘best of many’ sample objective,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 8485–8493.
|
[41] |
T. Gu, G. Chen, J. Li, C. Lin, Y. Rao, J. Zhou, and J. Lu, “Stochastic trajectory prediction via motion indeterminacy diffusion,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, New Orleans, USA, 2022, pp. 17092–17101.
|
[42] |
I. Bae, J. H. Park, and H. G. Jeon, “Non-probability sampling network for stochastic human trajectory prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, New Orleans, USA, 2022, pp. 6467–6477.
|
[43] |
S. Pellegrini, A. Ess, K. Schindler, and L. van Gool, “You'll never walk alone: Modeling social behavior for multi-target tracking,” in Proc. IEEE 12th Int. Conf. Computer Vision, Kyoto, Japan, 2009, pp. 261–268.
|
[44] |
A. Lerner, Y. Chrysanthou, and D. Lischinski, “Crowds by example,” Comput. Graph. Forum, vol. 26, no. 3, pp. 655–664, Sep. 2007.
|
[45] |
A. Robicquet, A. Sadeghian, A. Alahi, and S. Savarese, “Learning social etiquette: Human trajectory understanding in crowded scenes,” in Proc. 14th European Conf. Computer Vision, Amsterdam, The Netherlands, 2016, pp. 549–565.
|
[46] |
K. Linou, “NBA player movements,” 2016. [Online]. Available: https://github.com/linouk23/NBA-Player-Movements. Accessed on: August 6, 2024.
|
[47] |
N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. S. Torr, and M. Chandraker, “DESIRE: Distant future prediction in dynamic scenes with interacting agents,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Honolulu, USA, 2017, pp. 2165–2174.
|
[48] |
P. Zhang, J. Xue, P. Zhang, N. Zheng, and W. Ouyang, “Social-aware pedestrian trajectory prediction via states refinement LSTM,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 5, pp. 2742–2759, May 2022.
|
[49] |
C. Xu, M. Li, Z. Ni, Y. Zhang, and S. Chen, “GroupNet: Multiscale hypergraph neural networks for trajectory prediction with relational reasoning,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, New Orleans, USA, 2022, pp. 6488–6497.
|
[50] |
T. Salzmann, B. Ivanovic, P. Chakravarty, and M. Pavone, “Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data,” in Proc. 16th European Conf. Computer Vision, Glasgow, UK, 2020, pp. 683–700.
|
[51] |
J. Li, F. Yang, M. Tomizuka, and C. Choi, “EvolveGraph: Multi-agent trajectory prediction with dynamic relational reasoning,” in Proc. 34th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, p. 1660.
|
[52] |
K. Mangalam, H. Girase, S. Agarwal, K. H. Lee, E. Adeli, J. Malik, and A. Gaidon, “It is not the journey but the destination: Endpoint conditioned trajectory prediction,” in Proc. 16th European Conf. Computer Vision, Glasgow, UK, 2020, pp. 759–776.
|
[53] |
Y. Huang, H. Bi, Z. Li, T. Mao, and Z. Wang, “STGAT: Modeling spatial-temporal interactions for human trajectory prediction,” in Proc. IEEE/CVF Int. Conf. Computer Vision, Seoul, South Korea, 2019, pp. 6271–6280.
|
[54] |
T. Kipf, E. Fetaya, K. C. Wang, M. Welling, and R. Zemel, “Neural relational inference for interacting systems,” in Proc. 35th Int. Conf. Machine Learning, Stockholm, Sweden, 2018, pp. 2688–2697.
|
[55] |
C. Xu, W. Mao, W. Zhang, and S. Chen, “Remember intentions: Retrospective-memory-based trajectory prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, New Orleans, USA, 2022, pp. 6478–6487.
|
[56] |
M. Geisslinger, F. Poszler, and M. Lienkamp, “An ethical trajectory planning algorithm for autonomous vehicles,” Nat. Mach. Intell., vol. 5, no. 2, pp. 137–144, Feb. 2023. doi: 10.1038/s42256-022-00607-z
|
[57] |
W. Wei and J. Wang, “Ethical decision-making for autonomous driving based on LSTM trajectory prediction network,” Procedia Comput. Sci., vol. 226, pp. 134–140, Aug. 2023.
|
[58] |
M. Althoff, M. Koschi, and S. Manzinger, “CommonRoad: Composable benchmarks for motion planning on roads,” in Proc. IEEE Intelligent Vehicles Symp., Los Angeles, USA, 2017, pp. 719–726.
|