Embedded sparse coding for summarizing multi-view videos

Most traditional video summarization methods are designed to generate effective summaries for single-view videos, so they cannot fully exploit the complicated intra- and inter-view correlations that arise when summarizing multi-view videos. In this paper, we introduce a novel framework for summarizing multi-view videos that accounts for both intra- and inter-view correlations in a joint embedding space. We learn the embedding by minimizing an objective function with two terms: one capturing intra-view correlations within each view and another capturing inter-view correlations across the multiple views. The objective is minimized with a Majorization-Minimization algorithm, which monotonically decreases the cost function at each iteration. We then apply a sparse representative selection approach over the learned embedding space to summarize the multi-view videos. Experiments on several multi-view datasets demonstrate that the proposed approach outperforms state-of-the-art methods.
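
The sparse representative selection step mentioned above can be illustrated with a minimal sketch. It is not the formulation used in the paper, which the abstract does not spell out. It assumes the common self-expressive model, minimizing ||Y - YZ||_F^2 + lam * ||Z||_{2,1} over Z, where the columns of Y are the embedded frames, and solves it with plain proximal gradient descent. The function name `rank_representatives` and the parameters `lam` and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rank_representatives(Y, lam=0.1, n_iter=300):
    """Score each column of Y (d x n, one embedded frame per column).

    Assumed objective (illustrative, not taken from the paper):
        min_Z ||Y - Y Z||_F^2 + lam * sum_i ||Z[i, :]||_2
    A frame whose row of Z has a large norm helps reconstruct many
    other frames, so it is a candidate for the summary.
    """
    n = Y.shape[1]
    G = Y.T @ Y                                # Gram matrix, n x n
    # Lipschitz constant of the gradient of the smooth term
    # (floored to avoid division by zero for all-zero input).
    L = max(2.0 * np.linalg.norm(G, 2), 1e-12)
    Z = np.zeros((n, n))
    for _ in range(n_iter):
        grad = 2.0 * (G @ Z - G)               # gradient of ||Y - Y Z||_F^2
        V = Z - grad / L                       # gradient step with step size 1/L
        # Proximal step for the row-wise l2 penalty: shrink each row norm.
        row_norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - (lam / L) / np.maximum(row_norms, 1e-12))
        Z = shrink * V
    return np.linalg.norm(Z, axis=1)           # one non-negative score per frame


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.normal(size=(3, 20))               # stand-in for a learned embedding
    scores = rank_representatives(Y)
    print(np.argsort(-scores)[:5])             # indices of the five top-scoring frames
```

Under these assumptions, a summary of length k is formed from the k frames with the largest scores. Larger values of `lam` drive more rows of Z to zero, so fewer frames receive a non-zero score.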
