IEEE/CAA Journal of Automatica Sinica, Volume 13, Issue 1

Citation: Q. Yu, Z. Wang, G. Wei, and H. Yu, “Deep learning for video summarization: Systematic review, challenges and opportunities,” IEEE/CAA J. Autom. Sinica, vol. 13, no. 1, pp. 21–42, Jan. 2026. doi: 10.1109/JAS.2025.125864
[1] L.-F. Li, Y. Hua, Y.-H. Liu, and F.-H. Huang, “Study on fast fractal image compression algorithm based on centroid radius,” Syst. Sci. Control Eng., vol. 12, no. 1, p. 2269183, Jan. 2024. doi: 10.1080/21642583.2023.2269183
[2] G. Liang, Y. Lv, S. Li, X. Wang, and Y. Zhang, “Video summarization with a dual-path attentive network,” Neurocomputing, vol. 467, pp. 1–9, Jan. 2022. doi: 10.1016/j.neucom.2021.09.015
[3] H. Li, Q. Ke, M. Gong, and R. Zhang, “Video joint modelling based on hierarchical transformer for co-summarization,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 3, pp. 3904–3917, Mar. 2023. doi: 10.1109/TPAMI.2022.3186506
[4] J. Lu, X. Wang, X. Cheng, J. Yang, O. Kwan, and X. Wang, “Parallel factories for smart industrial operations: From big AI models to field foundational models and scenarios engineering,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 12, pp. 2079–2086, Dec. 2022. doi: 10.1109/JAS.2022.106094
[5] Y. Sun, Y. Hu, H. Zhang, H. Chen, and F.-Y. Wang, “A parallel emission regulatory framework for intelligent transportation systems and smart cities,” IEEE Trans. Intell. Veh., vol. 8, no. 2, pp. 1017–1020, Feb. 2023. doi: 10.1109/TIV.2023.3246045
[6] O. Friha, M. A. Ferrag, L. Shu, L. Maglaras, and X. Wang, “Internet of things for the future of smart agriculture: A comprehensive survey of emerging technologies,” IEEE/CAA J. Autom. Sinica, vol. 8, no. 4, pp. 718–752, Apr. 2021. doi: 10.1109/JAS.2021.1003925
[7] H. Yu, Y. Wang, Y. Tian, H. Zhang, W. Zheng, and F.-Y. Wang, “Social vision for intelligent vehicles: From computer vision to foundation vision,” IEEE Trans. Intell. Veh., vol. 8, no. 11, pp. 4474–4476, Nov. 2023. doi: 10.1109/TIV.2023.3330870
[8] W. Ehab, L. Huang, and Y. Li, “UNet and variants for medical image segmentation,” Int. J. Network Dyn. Intell., vol. 3, no. 2, p. 100009, Jun. 2024. doi: 10.53941/ijndi.2024.100009
[9] R. Wang and J. Liang, “The effect of multiscale parameters on the spiking properties of the morphological neuron with excitatory autapse,” Syst. Sci. Control Eng., vol. 12, no. 1, p. 2313865, Mar. 2024. doi: 10.1080/21642583.2024.2313865
[10] D. Li, K. D. Ortegas, and M. White, “Exploring the computational effects of advanced deep neural networks on logical and activity learning for enhanced thinking skills,” Systems, vol. 11, no. 7, p. 319, Jun. 2023. doi: 10.3390/systems11070319
[11] F. Yao, Y. Ding, S. Hong, and S.-H. Yang, “A survey on evolved LoRa-based communication technologies for emerging internet of things applications,” Int. J. Network Dyn. Intell., vol. 1, no. 1, pp. 4–19, Dec. 2022.
[12] Y. Wang, Q. Huang, C. Jiang, J. Liu, M. Shang, and Z. Miao, “Video stabilization: A comprehensive survey,” Neurocomputing, vol. 516, pp. 205–230, Jan. 2023. doi: 10.1016/j.neucom.2022.10.008
[13] A. Mohan, K. Gauen, Y.-H. Lu, W. W. Li, and X. Chen, “Internet of video things in 2030: A world with many cameras,” in Proc. IEEE Int. Symp. Circuits and Systems, Baltimore, USA, 2017, pp. 1–4.
[14] Z. Li, Y. Li, B. Tan, S. Ding, and S. Xie, “Structured sparse coding with the group log-regularizer for key frame extraction,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 10, pp. 1818–1830, Oct. 2022. doi: 10.1109/JAS.2022.105602
[15] M. A. Smith and T. Kanade, “Video skimming and characterization through the combination of image and language understanding,” in Proc. IEEE Int. Workshop on Content-Based Access of Image and Video Database, Bombay, India, 1998, pp. 61–70.
[16] Y. Wu, X. Huang, Z. Tian, X. Yan, and H. Yu, “Emotion contagion model for dynamical crowd path planning,” Int. J. Network Dyn. Intell., vol. 3, no. 3, p. 100014, Sep. 2024. doi: 10.53941/ijndi.2024.100014
[17] B. Li and W. Li, “Distillation-based user selection for heterogeneous federated learning,” Int. J. Network Dyn. Intell., vol. 3, no. 2, p. 100007, Jun. 2024.
[18] M. Otani, Y. Nakashima, E. Rahtu, J. Heikkilä, and N. Yokoya, “Video summarization using deep semantic features,” in Proc. 13th Asian Conf. Computer Vision, Taipei, China, 2016, pp. 361–377.
[19] A. G. Money and H. Agius, “Video summarisation: A conceptual framework and survey of the state of the art,” J. Vis. Commun. Image Represent., vol. 19, no. 2, pp. 121–143, Feb. 2008. doi: 10.1016/j.jvcir.2007.04.002
[20] T. Sebastian and J. J. Puthiyidam, “A survey on video summarization techniques,” Int. J. Comput. Appl., vol. 132, no. 13, pp. 30–33, Dec. 2015. doi: 10.5120/ijca2015907592
[21] Y. S. Khan and S. Pawar, “Video summarization: Survey on event detection and summarization in soccer videos,” Int. J. Adv. Comput. Sci. Appl., vol. 6, no. 11, pp. 256–259, 2015.
[22] M. Ajmal, M. H. Ashraf, M. Shakir, Y. Abbas, and F. A. Shah, “Video summarization: Techniques and classification,” in Proc. Int. Conf. Computer Vision and Graphics, Warsaw, Poland, 2012, pp. 1–13.
[23] N. Adami, S. Benini, and R. Leonardi, “An overview of video shot clustering and summarization techniques for mobile applications,” in Proc. 2nd Int. Conf. Mobile Multimedia Communications, Alghero, Italy, 2006, p. 27.
[24] C. Ma, P. Cheng, and C. Cai, “Localization and mapping method based on multimodal information fusion and deep learning for dynamic object removal,” Int. J. Network Dyn. Intell., vol. 3, no. 2, p. 100008, Jun. 2024. doi: 10.53941/ijndi.2024.100008
[25] J. S. Alqurni, M. K. Alsmadi, H. Alfagham, S. Alzoubi, S. Ihab, A. Sameh, D. S. AbdElminaam, and O. I. Khalaf, “Streamlining video summarization with NLP: Techniques, implementation, and future direction,” SN Comput. Sci., vol. 6, no. 2, p. 110, Jan. 2025. doi: 10.1007/s42979-024-03591-w
[26] F. Shamsi and I. Sindhu, “Condensing video content: Deep learning advancements and challenges in video summarization innovations,” Int. J. Acad. Res. Humanit., vol. 4, no. 3, pp. 113–124, Jul. 2024.
[27] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Boston, USA, 2015, pp. 1–9.
[28] J. L. Elman, “Finding structure in time,” Cognit. Sci., vol. 14, no. 2, pp. 179–211, Mar. 1990. doi: 10.1207/s15516709cog1402_1
[29] D. Potapov, M. Douze, Z. Harchaoui, and C. Schmid, “Category-specific video summarization,” in Proc. 13th European Conf. Computer Vision, Zurich, Switzerland, 2014, pp. 540–555.
[30] J. Fajtl, H. S. Sokeh, V. Argyriou, D. Monekosso, and P. Remagnino, “Summarizing videos with attention,” in Proc. 14th Asian Conf. Computer Vision, Perth, Australia, 2018, pp. 39–54.
[31] F. Mohades Deilami, H. Sadr, and M. Tarkhan, “Contextualized multidimensional personality recognition using combination of deep neural network and ensemble learning,” Neural Process. Lett., vol. 54, no. 5, pp. 3811–3828, Apr. 2022. doi: 10.1007/s11063-022-10787-9
[32] H. Sadr and M. Nazari Soleimandarabi, “ACNN-TL: Attention-based convolutional neural network coupling with transfer learning and contextualized word representation for enhancing the performance of sentiment classification,” J. Supercomput., vol. 78, no. 7, pp. 10149–10175, Jan. 2022. doi: 10.1007/s11227-021-04208-2
[33] Z. Khodaverdian, H. Sadr, S. A. Edalatpanah, and M. Nazari, “An energy aware resource allocation based on combination of CNN and GRU for virtual machine selection,” Multimed. Tools Appl., vol. 83, no. 9, pp. 25769–25796, Aug. 2024. doi: 10.1007/s11042-023-16488-2
[34] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[35] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 770–778.
[36] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Honolulu, USA, 2017, pp. 2261–2269.
[37] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size,” arXiv preprint arXiv:1602.07360, 2016.
[38] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017.
[39] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997. doi: 10.1162/neco.1997.9.8.1735
[40] M. Schuster and K. K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Trans. Signal Process., vol. 45, no. 11, pp. 2673–2681, Nov. 1997. doi: 10.1109/78.650093
[41] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” in Proc. Conf. Empirical Methods in Natural Language Processing, Doha, Qatar, 2014, pp. 1724–1734.
[42] C. Sun, A. Myers, C. Vondrick, K. Murphy, and C. Schmid, “VideoBERT: A joint model for video and language representation learning,” in Proc. IEEE/CVF Int. Conf. Computer Vision, Seoul, South Korea, 2019, pp. 7463–7472.
[43] S. Bano, S. Khalid, N. M. Tairan, H. Shah, and H. A. Khattak, “Summarization of scholarly articles using BERT and BiGRU: Deep learning-based extractive approach,” J. King Saud Univ. Comput. Inf. Sci., vol. 35, no. 9, p. 101739, Oct. 2023. doi: 10.1016/j.jksuci.2023.101739
[44] P. Nagar, A. Rathore, C. V. Jawahar, and C. Arora, “Generating personalized summaries of day long egocentric videos,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 6, pp. 6832–6845, Jun. 2023. doi: 10.1109/TPAMI.2021.3118077
[45] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, USA, 2017, pp. 6000–6010.
[46] S. Bano and S. Khalid, “BERT-based extractive text summarization of scholarly articles: A novel architecture,” in Proc. Int. Conf. Artificial Intelligence of Things, Istanbul, Turkey, 2022, pp. 1–5.
[47] S. Huang, X. Li, Z. Zhang, F. Wu, and J. Han, “User-ranking video summarization with multi-stage spatio-temporal representation,” IEEE Trans. Image Process., vol. 28, no. 6, pp. 2654–2664, Jun. 2019.
[48] Y. Jung, D. Cho, S. Woo, and I. S. Kweon, “Global-and-local relative position embedding for unsupervised video summarization,” in Proc. 16th European Conf. Computer Vision, Glasgow, UK, 2020, pp. 167–183.
[49] S. Xiao, Z. Zhao, Z. Zhang, X. Yan, and M. Yang, “Convolutional hierarchical attention network for query-focused video summarization,” in Proc. 34th AAAI Conf. Artificial Intelligence, New York, USA, 2020, pp. 12426–12433.
[50] M. Rochan, L. Ye, and Y. Wang, “Video summarization using fully convolutional sequence networks,” in Proc. 15th European Conf. Computer Vision, Munich, Germany, 2018, pp. 358–374.
[51] M. Z. Khan, S. Jabeen, S. ul Hassan, M. Hassan, and M. U. G. Khan, “Video summarization using CNN and bidirectional LSTM by utilizing scene boundary detection,” in Proc. Int. Conf. Applied and Engineering Mathematics, Taxila, Pakistan, 2019, pp. 197–202.
[52] X. Feng, Y. Zhu, and C. Yang, “Video summarization based on fusing features and shot segmentation,” in Proc. 7th IEEE Int. Conf. Network Intelligence and Digital Content, Beijing, China, 2021, pp. 383–387.
[53] M. Yan, X. Jiang, and J. Yuan, “3D convolutional generative adversarial networks for detecting temporal irregularities in videos,” in Proc. 24th Int. Conf. Pattern Recognition, Beijing, China, 2018, pp. 2522–2527.
[54] T.-J. Fu, S.-H. Tai, and H.-T. Chen, “Attentive and adversarial learning for video summarization,” in Proc. IEEE Winter Conf. Applications of Computer Vision, Waikoloa, USA, 2019, pp. 1579–1587.
[55] T. Yao, T. Mei, and Y. Rui, “Highlight detection with pairwise deep ranking for first-person video summarization,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 982–990.
[56] K. Davila, F. Xu, S. Setlur, and V. Govindaraju, “FCN-LectureNet: Extractive summarization of whiteboard and chalkboard lecture videos,” IEEE Access, vol. 9, pp. 104469–104484, 2021.
[57] S.-H. Zhong, J. Wu, and J. Jiang, “Video summarization via spatio-temporal deep architecture,” Neurocomputing, vol. 332, pp. 224–235, Mar. 2019. doi: 10.1016/j.neucom.2018.12.040
[58] X. Li, Y. Liu, K. Wang, and F.-Y. Wang, “A recurrent attention and interaction model for pedestrian trajectory prediction,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 5, pp. 1361–1370, Sep. 2020. doi: 10.1109/JAS.2020.1003300
[59] Y. Zhang, Y. Liu, and C. Wu, “Attention-guided multi-granularity fusion model for video summarization,” Expert Syst. Appl., vol. 249, p. 123568, Sep. 2024. doi: 10.1016/j.eswa.2024.123568
[60] D. Vora, P. Kadam, D. D. Mohite, N. Kumar, N. Kumar, P. Radhakrishnan, and S. Bhagwat, “AI-driven video summarization for optimizing content retrieval and management through deep learning techniques,” Sci. Rep., vol. 15, no. 1, p. 4058, Feb. 2025. doi: 10.1038/s41598-025-87824-9
[61] B. Zhao, X. Li, and X. Lu, “Property-constrained dual learning for video summarization,” IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 10, pp. 3989–4000, Oct. 2020. doi: 10.1109/TNNLS.2019.2951680
[62] J. Wang, W. Wang, Z. Wang, L. Wang, D. Feng, and T. Tan, “Stacked memory network for video summarization,” in Proc. 27th ACM Int. Conf. Multimedia, Nice, France, 2019, pp. 836–844.
[63] B. Zhao, X. Li, and X. Lu, “TTH-RNN: Tensor-train hierarchical recurrent neural network for video summarization,” IEEE Trans. Ind. Electron., vol. 68, no. 4, pp. 3629–3637, Apr. 2021. doi: 10.1109/TIE.2020.2979573
[64] K. Zhang, K. Grauman, and F. Sha, “Retrospective encoders for video summarization,” in Proc. 15th European Conf. Computer Vision, Munich, Germany, 2018, pp. 391–408.
[65] L. Yuan, F. E. Tay, P. Li, L. Zhou, and J. Feng, “Cycle-SUM: Cycle-consistent adversarial LSTM networks for unsupervised video summarization,” in Proc. 33rd AAAI Conf. Artificial Intelligence, Honolulu, USA, 2019, pp. 9143–9150.
[66] Y. Jung, D. Cho, D. Kim, S. Woo, and I. S. Kweon, “Discriminative feature learning for unsupervised video summarization,” in Proc. 33rd AAAI Conf. Artificial Intelligence, Honolulu, USA, 2019, pp. 8537–8544.
[67] B. Zhao, M. Gong, and X. Li, “Audiovisual video summarization,” IEEE Trans. Neural Netw. Learn. Syst., vol. 34, no. 8, pp. 5181–5188, Aug. 2023. doi: 10.1109/TNNLS.2021.3119969
[68] Z. Ji, K. Xiong, Y. Pang, and X. Li, “Video summarization with attention-based encoder-decoder networks,” IEEE Trans. Circuits Syst. Video Technol., vol. 30, no. 6, pp. 1709–1717, Jun. 2020.
[69] K. Zhang, W.-L. Chao, F. Sha, and K. Grauman, “Video summarization with long short-term memory,” in Proc. 14th European Conf. Computer Vision, Amsterdam, The Netherlands, 2016, pp. 766–782.
[70] Z. Ji, F. Jiao, Y. Pang, and L. Shao, “Deep attentive and semantic preserving video summarization,” Neurocomputing, vol. 405, pp. 200–207, Sep. 2020. doi: 10.1016/j.neucom.2020.04.132
[71] X. Teng, X. Gui, P. Xu, Y. Shao, J. Tong, T. Du, and H. Dai, “A multi-flexible video summarization scheme using property-constraint decision tree,” Neurocomputing, vol. 506, pp. 406–417, Sep. 2022. doi: 10.1016/j.neucom.2022.07.077
[72] Y. Liu, B. Tian, Y. Lv, L. Li, and F.-Y. Wang, “Point cloud classification using content-based transformer via clustering in feature space,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 1, pp. 231–239, Jan. 2024. doi: 10.1109/JAS.2023.123432
[73] L. Lan, L. Jiang, T. Yu, X. Liu, and Z. He, “FullTransNet: Full transformer with local-global attention for video summarization,” arXiv preprint arXiv:2501.00882, 2025.
[74] S. Feng, Y. Xie, Y. Wei, J. Yan, and Q. Wang, “Transformer-based video summarization with spatial-temporal representation,” in Proc. 8th Int. Conf. Big Data and Information Analytics, Guiyang, China, 2022, pp. 428–433.
[75] M. Bilkhu, S. Wang, and T. Dobhal, “Attention is all you need for videos: Self-attention based video summarization using universal transformers,” arXiv preprint arXiv:1906.02792, 2019.
[76] B. Bilonoh and S. Mashtalir, “Parallel multi-head dot product attention for video summarization,” in Proc. IEEE 3rd Int. Conf. Data Stream Mining & Processing, Lviv, Ukraine, 2020, pp. 158–162.
[77] Z. Niu, G. Zhong, and H. Yu, “A review on the attention mechanism of deep learning,” Neurocomputing, vol. 452, pp. 48–62, Sep. 2021. doi: 10.1016/j.neucom.2021.03.091
[78] X. Li, M. Li, P. Yan, G. Li, Y. Jiang, H. Luo, and S. Yin, “Deep learning attention mechanism in medical image analysis: Basics and beyonds,” Int. J. Network Dyn. Intell., vol. 2, no. 1, pp. 93–116, Mar. 2023. doi: 10.53941/ijndi0201006
[79] J.-H. Huang, L. Murn, M. Mrak, and M. Worring, “GPT2MVS: Generative pre-trained transformer-2 for multi-modal video summarization,” in Proc. Int. Conf. Multimedia Retrieval, Taipei, China, 2021, pp. 580–589.
[80] M. Narasimhan, A. Rohrbach, and T. Darrell, “CLIP-it! Language-guided video summarization,” in Proc. 35th Int. Conf. Neural Information Processing Systems, 2021, vol. 34, p. 1072.
[81] X. Shang, Z. Yuan, A. Wang, and C. Wang, “Multimodal video summarization via time-aware transformers,” in Proc. 29th ACM Int. Conf. Multimedia, 2021, p. 1072.
[82] M. Jia, Y. Wei, X. Song, T. Sun, M. Zhang, and L. Nie, “Query-oriented micro-video summarization,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 6, pp. 4174–4187, Jun. 2024. doi: 10.1109/TPAMI.2024.3355402
[83] B. Zhao, M. Gong, and X. Li, “Hierarchical multimodal transformer to summarize videos,” Neurocomputing, vol. 468, pp. 360–369, Jan. 2022. doi: 10.1016/j.neucom.2021.10.039
[84] Y. Chen, B. Guo, Y. Shen, R. Zhou, W. Lu, W. Wang, X. Wen, and X. Suo, “Video summarization with u-shaped transformer,” Appl. Intell., vol. 52, no. 15, pp. 17864–17880, Apr. 2022. doi: 10.1007/s10489-022-03451-1
[85] H. Wang, B. Zhou, Z. Zhang, Y. Du, D. Ho, and K.-F. Wong, “M3sum: A novel unsupervised language-guided video summarization,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, Seoul, South Korea, 2024, pp. 4140–4144.
[86] X. Tian, Y. Jin, Z. Zhang, P. Liu, and X. Tang, “Video summarization with temporal-channel visual transformer,” Pattern Recognit., vol. 165, p. 111631, Sep. 2025. doi: 10.1016/j.patcog.2025.111631
[87] X. Yan, K. Deng, Q. Zou, Z. Tian, and H. Yu, “Self-cumulative contrastive graph clustering,” IEEE/CAA J. Autom. Sinica, vol. 12, no. 6, pp. 1194–1208, Jun. 2025. doi: 10.1109/JAS.2024.125025
[88] M. Wang, H. Wang, and H. Zheng, “A mini review of node centrality metrics in biological networks,” Int. J. Network Dyn. Intell., vol. 1, no. 1, pp. 99–110, Dec. 2022. doi: 10.53941/ijndi0101009
[89] E. R. Pinto, E. G. Nepomuceno, and A. S. L. O. Campanharo, “Individual-based modelling of animal brucellosis spread with the use of complex networks,” Int. J. Network Dyn. Intell., vol. 1, no. 1, pp. 120–129, Dec. 2022. doi: 10.53941/ijndi0101011
[90] Y. Ding, M. Fu, P. Luo, and F.-X. Wu, “Network learning for biomarker discovery,” Int. J. Network Dyn. Intell., vol. 2, no. 1, pp. 51–65, Mar. 2023. doi: 10.53941/ijndi0201004
[91] L. Ji, R. Tu, K. Lin, L. Wang, and N. Duan, “Multimodal graph neural network for video procedural captioning,” Neurocomputing, vol. 488, pp. 88–96, Jun. 2022. doi: 10.1016/j.neucom.2022.02.062
[92] C. Ma, L. Lyu, and C. Lyu, “A novel graph-based structural dissimilarity measure for video summarization,” in Proc. 3rd Int. Conf. Computer Vision, Image and Deep Learning & Int. Conf. Computer Engineering and Applications, Changchun, China, 2022, pp. 643–647.
[93] C. Ma, L. Lyu, G. Lu, and C. Lyu, “Adaptive multiview graph difference analysis for video summarization,” IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 12, pp. 8795–8808, Dec. 2022. doi: 10.1109/TCSVT.2022.3190998
[94] R. Zhong, R. Wang, Y. Zou, Z. Hong, and M. Hu, “Graph attention networks adjusted Bi-LSTM for video summarization,” IEEE Signal Process. Lett., vol. 28, pp. 663–667, 2021.
[95] B. Zhao, H. Li, X. Lu, and X. Li, “Reconstructive sequence-graph network for video summarization,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 5, pp. 2793–2801, May 2022. doi: 10.1109/TPAMI.2021.3072117
[96] S. Sahami, G. Cheung, and C.-W. Lin, “Fast graph sampling for short video summarization using Gershgorin disc alignment,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, Singapore, 2022, pp. 1765–1769.
[97] J. Park, J. Lee, I.-J. Kim, and K. Sohn, “SumGraph: Video summarization via recursive graph modeling,” in Proc. 16th European Conf. Computer Vision, Glasgow, UK, 2020, pp. 647–663.
[98] W. Zhu, Y. Han, J. Lu, and J. Zhou, “Relational reasoning over spatial-temporal graphs for video summarization,” IEEE Trans. Image Process., vol. 31, pp. 3017–3031, 2022.
[99] M. Ma, S. Mei, S. Wan, Z. Wang, X.-S. Hua, and D. D. Feng, “Graph convolutional dictionary selection with $L_{2,p}$ norm for video summarization,” IEEE Trans. Image Process., vol. 31, pp. 1789–1804, 2022.
[100] C.-H. Yeh, C.-M. Lien, Z.-X. Zhan, F.-H. Tsai, and M.-J. Chen, “Graph convolutional network for fast video summarization in compressed domain,” Neurocomputing, vol. 617, p. 128945, Feb. 2025. doi: 10.1016/j.neucom.2024.128945
[101] J. M. R. Chaves and S. Tripathi, “VideoSAGE: Video summarization with graph representation learning,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition Workshops, Seattle, USA, 2024, pp. 2527–2534.
[102] J. Gunuganti, Z.-T. Yeh, J.-H. Wang, and M. Norouzi, “Unsupervised video summarization with adversarial graph-based attention network,” J. Vis. Commun. Image Represent., vol. 102, p. 104200, Jun. 2024. doi: 10.1016/j.jvcir.2024.104200
[103] B. Mahasseni, M. Lam, and S. Todorovic, “Unsupervised video summarization with adversarial LSTM networks,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Honolulu, USA, 2017, pp. 2982–2991.
[104] Y. Xia, W. Zheng, Y. Wang, H. Yu, J. Dong, and F.-Y. Wang, “Local and global perception generative adversarial network for facial expression synthesis,” IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 3, pp. 1443–1452, Mar. 2022. doi: 10.1109/TCSVT.2021.3074032
[105] J. Sun, H. Yu, J. J. Zhang, J. Dong, H. Yu, and G. Zhong, “Face image-sketch synthesis via generative adversarial fusion,” Neural Networks, vol. 154, pp. 179–189, Oct. 2022. doi: 10.1016/j.neunet.2022.07.013
[106] H. Lin, Y. Liu, S. Li, and X. Qu, “How generative adversarial networks promote the development of intelligent transportation systems: A survey,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 9, pp. 1781–1796, Sep. 2023. doi: 10.1109/JAS.2023.123744
[107] M. P. Kalashami, M. M. Pedram, and H. Sadr, “EEG feature extraction and data augmentation in emotion recognition,” Comput. Intell. Neurosci., vol. 2022, no. 1, p. 7028517, Mar. 2022. doi: 10.1155/2022/7028517
[108] E. Apostolidis, E. Adamantidou, A. I. Metsai, V. Mezaris, and I. Patras, “Unsupervised video summarization via attention-driven adversarial learning,” in Proc. 26th Int. Conf. Multimedia Modeling, Daejeon, South Korea, 2020, pp. 492–504.
[109] X. He, Y. Hua, T. Song, Z. Zhang, Z. Xue, R. Ma, N. Robertson, and H. Guan, “Unsupervised video summarization with attentive conditional generative adversarial networks,” in Proc. 27th ACM Int. Conf. Multimedia, Nice, France, 2019, pp. 2296–2304.
[110] H. Lee and G. Lee, “Summarizing long-length videos with GAN-enhanced audio/visual features,” in Proc. IEEE/CVF Int. Conf. Computer Vision Workshop, Seoul, South Korea, 2019, pp. 3727–3731.
[111] R. Yanagi, R. Togo, T. Ogawa, and M. Haseyama, “Scene retrieval for video summarization based on text-to-image GAN,” in Proc. IEEE Int. Conf. Image Processing, Taipei, China, 2019, pp. 1825–1829.
[112] E. Apostolidis, E. Adamantidou, A. I. Metsai, V. Mezaris, and I. Patras, “AC-SUM-GAN: Connecting actor-critic and generative adversarial networks for unsupervised video summarization,” IEEE Trans. Circuits Syst. Video Technol., vol. 31, no. 8, pp. 3278–3292, Aug. 2021. doi: 10.1109/TCSVT.2020.3037883
[113] X. Guo, Z. Bi, J. Wang, S. Qin, S. Liu, and L. Qi, “Reinforcement learning for disassembly system optimization problems: A survey,” Int. J. Network Dyn. Intell., vol. 2, no. 1, pp. 1–14, 2023.
[114] F. Zhang, Q. Yang, and D. An, “Privacy preserving demand side management method via multi-agent reinforcement learning,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 10, pp. 1984–1999, Oct. 2023. doi: 10.1109/JAS.2023.123321
[115] J. Lu, L. Han, Q. Wei, X. Wang, X. Dai, and F.-Y. Wang, “Event-triggered deep reinforcement learning using parallel control: A case study in autonomous driving,” IEEE Trans. Intell. Veh., vol. 8, no. 4, pp. 2821–2831, Apr. 2023. doi: 10.1109/TIV.2023.3262132
[116] A. Phaphuangwittayakul, Y. Guo, F. Ying, W. Xu, and Z. Zheng, “Self-attention recurrent summarization network with reinforcement learning for video summarization task,” in Proc. IEEE Int. Conf. Multimedia and Expo, Shenzhen, China, 2021, pp. 1–6.
[117] K. Zhou, Y. Qiao, and T. Xiang, “Deep reinforcement learning for unsupervised video summarization with diversity-representativeness reward,” in Proc. 32nd AAAI Conf. Artificial Intelligence, New Orleans, USA, 2018, pp. 7582–7589.
[118] S.-S. Zang, H. Yu, Y. Song, and R. Zeng, “Unsupervised video summarization using deep Non-Local video summarization networks,” Neurocomputing, vol. 519, pp. 26–35, Jan. 2023. doi: 10.1016/j.neucom.2022.11.028
[119] Y. Yuan and J. Zhang, “Unsupervised video summarization via deep reinforcement learning with shot-level semantics,” IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 1, pp. 445–456, Jan. 2023. doi: 10.1109/TCSVT.2022.3197819
[120] H. Sun, X. Zhu, and C. Zhou, “Deep reinforcement learning for video summarization with semantic reward,” in Proc. 22nd Int. Conf. Software Quality, Reliability, and Security Companion, Guangzhou, China, 2022, pp. 754–755.
[121] Z. Li and L. Yang, “Weakly supervised deep reinforcement learning for video summarization with semantically meaningful reward,” in Proc. IEEE Winter Conf. Applications of Computer Vision, Waikoloa, USA, 2021, pp. 3238–3246.
[122] Y. Huang, R. Zhong, W. Yao, and R. Wang, “Unsupervised learning of visual and semantic features for video summarization,” in Proc. IEEE Int. Symp. Circuits and Systems, Daegu, South Korea, 2021, pp. 1–5.
[123] J. Lei, Q. Luan, X. Song, X. Liu, D. Tao, and M. Song, “Action parsing-driven video summarization based on reinforcement learning,” IEEE Trans. Circuits Syst. Video Technol., vol. 29, no. 7, pp. 2126–2137, Jul. 2019. doi: 10.1109/TCSVT.2018.2860797
[124] S. Lan, R. Panda, Q. Zhu, and A. K. Roy-Chowdhury, “FFNet: Video fast-forwarding via reinforcement learning,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 6771–6780.
[125] J. Hao, S. Liu, B. Guo, Y. Ding, and Z. Yu, “Context-adaptive online reinforcement learning for multi-view video summarization on mobile devices,” in Proc. IEEE 28th Int. Conf. Parallel and Distributed Systems, Nanjing, China, 2023, pp. 411–418.
[126] T. Liu, Q. Meng, J.-J. Huang, A. Vlontzos, D. Rueckert, and B. Kainz, “Video summarization through reinforcement learning with a 3D spatio-temporal U-Net,” IEEE Trans. Image Process., vol. 31, pp. 1573–1586, Feb. 2022.
[127] L. Lee, B. Eysenbach, R. R. Salakhutdinov, S. Gu, and C. Finn, “Weakly-supervised reinforcement learning for controllable behavior,” in Proc. 34th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, p. 224.
[128] W. Zhang, J. Wang, and F. Lan, “Dynamic hand gesture recognition based on short-term sampling neural networks,” IEEE/CAA J. Autom. Sinica, vol. 8, no. 1, pp. 110–120, Jan. 2021. doi: 10.1109/JAS.2020.1003465
[129] Y.-T. Liu, Y.-J. Li, F.-E. Yang, S.-F. Chen, and Y.-C. F. Wang, “Learning hierarchical self-attention for video summarization,” in Proc. IEEE Int. Conf. Image Processing, Taipei, China, 2019, pp. 3377–3381.
[130] E. Apostolidis, A. I. Metsai, E. Adamantidou, V. Mezaris, and I. Patras, “A stepwise, label-based approach for improving the adversarial training in unsupervised video summarization,” in Proc. 1st Int. Workshop on AI for Smart TV Content Production, Access and Delivery, Nice, France, 2019, pp. 17–25.
[131] T. Liu, Q. Meng, A. Vlontzos, J. Tan, D. Rueckert, and B. Kainz, “Ultrasound video summarization using deep reinforcement learning,” in Proc. 23rd Int. Conf. Medical Image Computing and Computer Assisted Intervention, Lima, Peru, 2020, pp. 483–492.
[132] M. S. Afzal and M. A. Tahir, “Reinforcement learning based video summarization with combination of ResNet and gated recurrent unit,” in Proc. 16th Int. Joint Conf. Computer Vision, Imaging and Computer Graphics Theory and Applications, 2021, pp. 261–268.
[133] G. Wu, S. Song, X. Wang, and J. Zhang, “Reconstructive network under contrastive graph rewards for video summarization,” Expert Syst. Appl., vol. 250, p. 123860, Sep. 2024. doi: 10.1016/j.eswa.2024.123860
[134] B. Zhao, X. Li, and X. Lu, “Hierarchical recurrent neural network for video summarization,” in Proc. 25th ACM Int. Conf. Multimedia, Mountain View, USA, 2017, pp. 863–871.
[135] A. K. Singh, D. Srivastava, and M. Tapaswi, “‘Previously on…’ from recaps to story summarization,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Seattle, USA, 2024, pp. 13635–13646.
| [136] |
W.-T. Chu and Y.-H. Liu, “Spatiotemporal modeling and label distribution learning for video summarization,” in Proc. IEEE 21st Int. Workshop on Multimedia Signal Processing, Kuala Lumpur, Malaysia, 2019, pp. 1–6.
|
| [137] |
B. Zhao, X. Li, and X. Lu, “HSA-RNN: Hierarchical structure-adaptive RNN for video summarization,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 7405–7414.
|
| [138] |
Y. Song, J. Vallmitjana, A. Stent, and A. Jaimes, “TVSum: Summarizing web videos using titles,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Boston, USA, 2015, pp. 5179–5187.
|
| [139] |
R. Panda, A. Das, Z. Wu, J. Ernst, and A. K. Roy-Chowdhury, “Weakly supervised summarization of web videos,” in Proc. IEEE Int. Conf. Computer Vision, Venice, Italy, 2017, pp. 3677–3686.
|
| [140] |
R. Panda and A. K. Roy-Chowdhury, “Collaborative summarization of topic-related videos,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Honolulu, USA, 2017, pp. 4274–4283.
|
| [141] |
W.-S. Chu, Y. Song, and A. Jaimes, “Video co-summarization: Video summarization by visual co-occurrence,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Boston, USA, 2015, pp. 3584–3592.
|
| [142] |
K. Zhang, W.-L. Chao, F. Sha, and K. Grauman, “Summary transfer: Exemplar-based subset selection for video summarization,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 1059–1067.
|
| [143] |
X. Li, B. Zhao, and X. Lu, “A general framework for edited video and raw video summarization,” IEEE Trans. Image Process., vol. 26, no. 8, pp. 3652–3664, Aug. 2017. doi: 10.1109/TIP.2017.2695887
|
| [144] |
M. Sun, A. Farhadi, and S. Seitz, “Ranking domain-specific highlights by analyzing edited videos,” in Proc. 13th European Conf. Computer Vision, Zurich, Switzerland, 2014, pp. 787–802.
|
| [145] |
J. Gao, X. Yang, Y. Zhang, and C. Xu, “Unsupervised video summarization via relation-aware assignment learning,” IEEE Trans. Multimedia, vol. 23, pp. 3203–3214, 2021.
|
| [146] |
J. Wu, S.-H. Zhong, and Y. Liu, “Dynamic graph convolutional network for multi-video summarization,” Pattern Recognit., vol. 107, p. 107382, Nov. 2020. doi: 10.1016/j.patcog.2020.107382
|
| [147] |
S. Messaoud, I. Lourentzou, A. Boughoula, M. Zehni, Z. Zhao, C. Zhai, and A. G. Schwing, “DeepQAMVS: Query-aware hierarchical pointer networks for multi-video summarization,” in Proc. 44th Int. ACM SIGIR Conf. Research and Development in Information Retrieval, New York, USA, 2021, pp. 1389–1399.
|
| [148] |
A. Alfaro and I. Sipiran, “Unsupervised video summarization: A reconstruction model with proximal gradient methods,” in Computer Vision-ECCV 2024 Workshops, A. Del Bue, C. Canton, J. Pont-Tuset, and T. Tommasi, Eds. Cham, Switzerland: Springer, 2025, pp. 84–99.
|
| [149] |
R. Panda, N. C. Mithun, and A. K. Roy-Chowdhury, “Diversity-aware multi-video summarization,” IEEE Trans. Image Process., vol. 26, no. 10, pp. 4712–4724, Oct. 2017. doi: 10.1109/TIP.2017.2708902
|
| [150] |
Z. Ji, Y. Ma, Y. Pang, and X. Li, “Query-aware sparse coding for web multi-video summarization,” Inf. Sci., vol. 478, pp. 152–166, Apr. 2019. doi: 10.1016/j.ins.2018.09.050
|
| [151] |
X. Yan, S. Hu, Y. Mao, Y. Ye, and H. Yu, “Deep multi-view learning methods: A review,” Neurocomputing, vol. 448, pp. 106–129, Aug. 2021. doi: 10.1016/j.neucom.2021.03.090
|
| [152] |
X. Yan, Y. Mao, Y. Ye, and H. Yu, “Incremental multiview clustering with continual information bottleneck method,” IEEE Trans. Syst., Man, Cybern.: Syst., vol. 55, no. 1, pp. 295–306, Jan. 2025. doi: 10.1109/TSMC.2024.3465039
|
| [153] |
S.-H. Ou, C.-H. Lee, V. S. Somayazulu, Y.-K. Chen, and S.-Y. Chien, “On-line multi-view video summarization for wireless video sensor network,” IEEE J. Sel. Top. Signal Process., vol. 9, no. 1, pp. 165–179, Feb. 2015. doi: 10.1109/JSTSP.2014.2331916
|
| [154] |
T. Hussain, K. Muhammad, A. Ullah, J. Del Ser, A. H. Gandomi, M. Sajjad, S. W. Baik, and V. H. C. de Albuquerque, “Multiview summarization and activity recognition meet edge computing in IoT environments,” IEEE Internet Things J., vol. 8, no. 12, pp. 9634–9644, Jun. 2021. doi: 10.1109/JIOT.2020.3027483
|
| [155] |
A. Kanehira, L. Van Gool, Y. Ushiku, and T. Harada, “Viewpoint-aware video summarization,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018, pp. 7435–7444.
|
| [156] |
A. Khosla, R. Hamid, C.-J. Lin, and N. Sundaresan, “Large-scale video summarization using web-image priors,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Portland, USA, 2013, pp. 2698–2705.
|
| [157] |
M. Rochan and Y. Wang, “Video summarization by learning from unpaired data,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 7894–7903.
|
| [158] |
H. Wei, B. Ni, Y. Yan, H. Yu, X. Yang, and C. Yao, “Video summarization via semantic attended networks,” in Proc. 32nd AAAI Conf. Artificial Intelligence, New Orleans, USA, 2018, pp. 216–223.
|
| [159] |
L. Feng, Z. Li, Z. Kuang, and W. Zhang, “Extractive video summarizer with memory augmented neural networks,” in Proc. 26th ACM Int. Conf. Multimedia, Seoul, Republic of Korea, 2018, pp. 976–983.
|
| [160] |
M. Elfeki and A. Borji, “Video summarization via actionness ranking,” in Proc. IEEE Winter Conf. Applications of Computer Vision, Waikoloa, USA, 2019, pp. 754–763.
|
| [161] |
C. Huang and H. Wang, “A novel key-frames selection framework for comprehensive video summarization,” IEEE Trans. Circuits Syst. Video Technol., vol. 30, no. 2, pp. 577–589, Feb. 2020. doi: 10.1109/TCSVT.2019.2890899
|
| [162] |
Y. Yuan, H. Li, and Q. Wang, “Spatiotemporal modeling for video summarization using convolutional recurrent neural network,” IEEE Access, vol. 7, pp. 64676–64685, 2019.
|
| [163] |
L. Lan and C. Ye, “Recurrent generative adversarial networks for unsupervised WCE video summarization,” Knowl.-Based Syst., vol. 222, p. 106971, Jun. 2021. doi: 10.1016/j.knosys.2021.106971
|
| [164] |
P. Li, Q. Ye, L. Zhang, L. Yuan, X. Xu, and L. Shao, “Exploring global diverse attention via pairwise temporal relation for video summarization,” Pattern Recognit., vol. 111, p. 107677, Mar. 2021. doi: 10.1016/j.patcog.2020.107677
|
| [165] |
J. A. Ghauri, S. Hakimov, and R. Ewerth, “Supervised video summarization via multiple feature sets with parallel attention,” in Proc. IEEE Int. Conf. Multimedia and Expo, Shenzhen, China, 2021, pp. 1–6.
|
| [166] |
P. Li, C. Tang, and X. Xu, “Video summarization with a graph convolutional attention network,” Front. Inf. Technol. Electron. Eng., vol. 22, no. 6, pp. 902–913, Jun. 2021. doi: 10.1631/FITEE.2000429
|
| [167] |
J. Lin, S.-H. Zhong, and A. Fares, “Deep hierarchical LSTM networks with attention for video summarization,” Comput. Electr. Eng., vol. 97, p. 107618, Jan. 2022. doi: 10.1016/j.compeleceng.2021.107618
|
| [168] |
J.-H. Huang, “Multi-modal video summarization,” in Proc. Int. Conf. Multimedia Retrieval, 2024, pp. 1214–1218.
|
| [169] |
Y. Zhang, X. Liang, D. Zhang, M. Tan, and E. P. Xing, “Unsupervised object-level video summarization with online motion auto-encoder,” Pattern Recognit. Lett., vol. 130, pp. 376–385, Feb. 2020. doi: 10.1016/j.patrec.2018.07.030
|
| [170] |
U.-N. Yoon, M.-D. Hong, and G.-S. Jo, “Interp-SUM: Unsupervised video summarization with piecewise linear interpolation,” Sensors, vol. 21, no. 13, p. 4562, Jul. 2021. doi: 10.3390/s21134562
|
| [171] |
M. Kaseris, I. Mademlis, and I. Pitas, “Adversarial unsupervised video summarization augmented with dictionary loss,” in Proc. IEEE Int. Conf. Image Processing, Anchorage, USA, 2021, pp. 2683–2687.
|
| [172] |
M. Gygli, H. Grabner, H. Riemenschneider, and L. Van Gool, “Creating summaries from user videos,” in Proc. 13th European Conf. Computer Vision, Zurich, Switzerland, 2014, pp. 505–520.
|
| [173] |
S. E. F. de Avila, A. P. B. Lopes, A. da Luz Jr., and A. de Albuquerque Araújo, “VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method,” Pattern Recognit. Lett., vol. 32, no. 1, pp. 56–68, Jan. 2011. doi: 10.1016/j.patrec.2010.08.004
|
| [174] |
K.-H. Zeng, T.-H. Chen, J. C. Niebles, and M. Sun, “Title generation for user generated videos,” in Proc. 14th European Conf. Computer Vision, Amsterdam, The Netherlands, 2016, pp. 609–625.
|
| [175] |
V. Kaushal, S. Kothawade, A. Tomar, R. Iyer, and G. Ramakrishnan, “How good is a video summary? A new benchmarking dataset and evaluation framework towards realistic video summarization,” arXiv preprint arXiv: 2101.10514, 2021.
|
| [176] |
H.-I. Ho, W.-C. Chiu, and Y.-C. F. Wang, “Summarizing first-person videos from third persons’ points of views,” in Proc. 15th European Conf. Computer Vision, Munich, Germany, 2018, pp. 72–89.
|
| [177] |
Y. J. Lee, J. Ghosh, and K. Grauman, “Discovering important people and objects for egocentric video summarization,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Providence, USA, 2012, pp. 1346–1353.
|
| [178] |
S. Yeung, A. Fathi, and L. Fei-Fei, “VideoSET: Video summary evaluation through text,” arXiv preprint arXiv: 1406.5824, 2014.
|
| [179] |
Z. Lei, C. Zhang, Q. Zhang, and G. Qiu, “FrameRank: A text processing approach to video summarization,” in Proc. IEEE Int. Conf. Multimedia and Expo, Shanghai, China, 2019, pp. 368–373.
|
| [180] |
C.-Y. Fu, J. Lee, M. Bansal, and A. Berg, “Video highlight prediction using audience chat reactions,” in Proc. Conf. Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 2017, pp. 972–978.
|
| [181] |
V. Kaushal, S. Subramanian, S. Kothawade, R. Iyer, and G. Ramakrishnan, “A framework towards domain specific video summarization,” in Proc. IEEE Winter Conf. Applications of Computer Vision, Waikoloa, USA, 2019, pp. 666–675.
|
| [182] |
E. R. Ziegel, “Standard probability and statistics tables and formulae,” Technometrics, vol. 43, no. 2, pp. 249–250, 2001.
|
| [183] |
M. G. Kendall, “The treatment of ties in ranking problems,” Biometrika, vol. 33, no. 3, pp. 239–251, Nov. 1945. doi: 10.2307/2332303
|
| [184] |
H. Sadr, M. Nazari, M. M. Pedram, and M. Teshnehlab, “Exploring the efficiency of topic-based models in computing semantic relatedness of geographic terms,” Int. J. Web Res., vol. 2, no. 2, pp. 23–35, Autumn 2019.
|
| [185] |
J. Almeida, N. J. Leite, and R. da S. Torres, “VISON: VIdeo summarization for ONline applications,” Pattern Recognit. Lett., vol. 33, no. 4, pp. 397–409, Mar. 2012. doi: 10.1016/j.patrec.2011.08.007
|
| [186] |
H. Jacob, F. L. C. Pádua, A. Lacerda, and A. C. M. Pereira, “A video summarization approach based on the emulation of bottom-up mechanisms of visual attention,” J. Intell. Inf. Syst., vol. 49, no. 2, pp. 193–211, Jan. 2017. doi: 10.1007/s10844-016-0441-4
|
| [187] |
A. Sharghi, J. S. Laurel, and B. Gong, “Query-focused video summarization: Dataset, evaluation, and a memory network based approach,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Honolulu, USA, 2017, pp. 2127–2136.
|
| [188] |
N. Ejaz, I. Mehmood, and S. W. Baik, “Feature aggregation based visual attention model for video summarization,” Comput. Electr. Eng., vol. 40, no. 3, pp. 993–1005, Apr. 2014. doi: 10.1016/j.compeleceng.2013.10.005
|
| [189] |
S. Khalid, S. Wu, and F. Zhang, “A multi-objective approach to determining the usefulness of papers in academic search,” Data Technol. Appl., vol. 55, no. 5, pp. 734–748, Apr. 2021. doi: 10.1108/dta-05-2020-0104
|
| [190] |
M. Otani, Y. Nakashima, E. Rahtu, and J. Heikkilä, “Rethinking the evaluation of video summaries,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Long Beach, USA, 2019, pp. 7588–7596.
|
| [191] |
A. Singh, D. M. Thounaojam, and S. Chakraborty, “A novel automatic shot boundary detection algorithm: Robust to illumination and motion effect,” Signal, Image Video Process., vol. 14, no. 4, pp. 645–653, Nov. 2020. doi: 10.1007/s11760-019-01593-3
|
| [192] |
B. Rashmi and H. Nagendraswamy, “Video shot boundary detection using block based cumulative approach,” Multimed. Tools Appl., vol. 80, no. 1, pp. 641–664, Sep. 2021. doi: 10.1007/s11042-020-09697-6
|
| [193] |
S. Chakraborty, A. Singh, and D. M. Thounaojam, “A novel bifold-stage shot boundary detection algorithm: Invariant to motion and illumination,” Vis. Comput., vol. 38, no. 2, pp. 445–456, Jan. 2022. doi: 10.1007/s00371-020-02027-9
|
| [194] |
A. K. Mallick and S. Mukhopadhyay, “Video retrieval framework based on color co-occurrence feature of adaptive low rank extracted keyframes and graph pattern matching,” Inf. Process. Manage., vol. 59, no. 2, p. 102870, Mar. 2022. doi: 10.1016/j.ipm.2022.102870
|
| [195] |
Z. Gao, G. Lu, and P. Yan, “Key-frame selection for video summarization: An approach of multidimensional time series analysis,” Multidimens. Syst. Signal Process., vol. 29, no. 4, pp. 1485–1505, Jul. 2018. doi: 10.1007/s11045-017-0513-9
|
| [196] |
R. Radhakrishnan, A. Divakaran, and Z. Xiong, “A time series clustering based framework for multimedia mining and summarization using audio features,” in Proc. 6th ACM SIGMM Int. Workshop on Multimedia Information Retrieval, New York, USA, 2004, pp. 157–164.
|
| [197] |
K. Fujimura, K. Honda, and K. Uehara, “Automatic video summarization by using color and utterance information,” in Proc. IEEE Int. Conf. Multimedia and Expo, Lausanne, Switzerland, 2002, pp. 49–52.
|
| [198] |
A. Divakaran, K. A. Peker, and H. Sun, “Video summarization using motion descriptors,” in Proc. SPIE 4315, Storage and Retrieval for Media Databases, San Jose, USA, 2001, pp. 517–522.
|
| [199] |
Y. Takahashi, N. Nitta, and N. Babaguchi, “Video summarization for large sports video archives,” in Proc. IEEE Int. Conf. Multimedia and Expo, Amsterdam, The Netherlands, 2005, pp. 1170–1173.
|
| [200] |
B. Li and M. I. Sezan, “Event detection and summarization in sports video,” in Proc. IEEE Workshop on Content-Based Access of Image and Video Libraries, Kauai, USA, 2001, pp. 132–138.
|
| [201] |
J.-G. Kim, H. S. Chang, K. Kang, M. Kim, J. Kim, and H.-M. Kim, “Summarization of news video and its description for content-based access,” Int. J. Imaging Syst. Technol., vol. 13, no. 5, pp. 267–274, Mar. 2003.
|
| [202] |
M. Sheng, W. Ding, and W. Sheng, “Differential evolution with adaptive niching and reinitialisation for nonlinear equation systems,” Int. J. Syst. Sci., vol. 55, no. 10, pp. 2172–2186, Apr. 2024. doi: 10.1080/00207721.2024.2337039
|
| [203] |
L.-N. Liu and G.-H. Yang, “Distributed energy resource coordination for a microgrid over unreliable communication network with DoS attacks,” Int. J. Syst. Sci., vol. 55, no. 2, pp. 237–252, Oct. 2024. doi: 10.1080/00207721.2023.2269294
|