Dynamic graph convolutional network for multi-video summarization

Abstract: Multi-video summarization is an effective tool for helping users browse multiple videos. In this paper, we formulate multi-video summarization as a graph analysis problem and propose a dynamic graph convolutional network that measures the importance and relevance of each video shot both within its own video and across the whole video collection. Two strategies are proposed to address the class imbalance inherent in the video summarization task. Moreover, we propose a diversity regularization that encourages the model to generate a diverse summary. Extensive experiments compare our method with state-of-the-art video summarization methods as well as with traditional and recent graph models. Our method achieves state-of-the-art performance on two standard video summarization datasets. The results demonstrate the effectiveness of the proposed model in generating a representative and diverse summary of multiple videos.
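To make the graph formulation concrete, the following is a minimal sketch of scoring shots with a single generic graph convolution layer. It is not the dynamic architecture proposed in the paper, whose details the abstract does not give. The cosine-similarity graph, the symmetric normalization, and the names `cosine_similarity_graph`, `normalize_adjacency`, `gcn_layer`, and `score_shots` are all illustrative assumptions.

```python
import numpy as np

def cosine_similarity_graph(x):
    """Build a dense shot graph from features (assumed construction, not the paper's)."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)          # self-loops are added during normalization
    return np.clip(sim, 0.0, None)      # keep only non-negative edge weights

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, a common GCN normalization."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, x, w):
    """One graph convolution: aggregate neighbour features, project, apply ReLU."""
    return np.maximum(a_norm @ x @ w, 0.0)

def score_shots(x, w_hidden, w_out):
    """Map shot features to one importance score in (0, 1) per shot."""
    a_norm = normalize_adjacency(cosine_similarity_graph(x))
    hidden = gcn_layer(a_norm, x, w_hidden)
    logits = a_norm @ hidden @ w_out    # second propagation step, no ReLU
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))

# Hypothetical data: 6 shots pooled from several videos, 8-dim features each.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 8))
scores = score_shots(features, rng.normal(size=(8, 4)), rng.normal(size=(4, 1)))
```

Because every shot from every video is a node in one graph, each score depends on the shot's neighbours across the whole collection and not only on the shot itself. In this sketch the weights are random, so the scores are meaningless until the weights are trained.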
