Photo Stream Alignment and Summarization for Collaborative Photo Collection and Sharing

With the popularity of digital cameras and camera phones, it is common for different people, who may or may not know each other, to attend the same event and take pictures and videos from different spatial or personal perspectives. Within the realm of social media, it is desirable to enable these people to select and share their pictures and videos in order to enrich memories and facilitate social networking. However, it is cumbersome to manually manage photos from different cameras, whose clock settings are often not calibrated. In this paper, we propose automatic algorithms to address these problems. First, we align the photo streams (sequences) taken by different photographers at the same event in chronological order on a common timeline, while respecting the time constraints within each photo stream. Given the preferred similarity measures (e.g., visual and spatial similarities), our algorithm performs photo stream alignment via matching on a bipartite kernel sparse representation graph that explicitly forces the data connections to be sparse. Second, we produce a summary master stream from the aligned super stream for efficient sharing by removing redundant photos from the super stream while preserving its temporal integrity. Based on a similar kernel sparse representation graph, our master stream summarization algorithm performs greedy backward selection to drop redundant photos without affecting the integrity of the remaining photos for the entire event. We evaluate our algorithms on real-world personal online albums for 36 events and demonstrate their efficacy in automatically facilitating collaborative photo collection and sharing.
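The greedy backward selection idea can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a plain pairwise similarity matrix in place of the kernel sparse representation graph, and the function name `greedy_backward_summary`, the `keep` parameter, and the redundancy score (similarity to the closest remaining photo) are all hypothetical choices made for illustration.

```python
import numpy as np


def greedy_backward_summary(similarity, keep):
    """Drop the most redundant photo until `keep` photos remain.

    similarity: (n, n) symmetric array; larger values mean more alike.
                (Assumption: a stand-in for the graph weights in the paper.)
    keep:       number of photos to retain in the summary.
    Returns the indices of the kept photos in their original stream order.
    """
    sim = np.array(similarity, dtype=float)  # copy so the input is untouched
    np.fill_diagonal(sim, -np.inf)  # a photo is never redundant with itself
    remaining = list(range(sim.shape[0]))

    while len(remaining) > keep:
        # Restrict the similarity matrix to photos still in the stream.
        sub = sim[np.ix_(remaining, remaining)]
        # Redundancy of a photo = similarity to its closest remaining photo.
        redundancy = sub.max(axis=1)
        # Remove the photo that is best covered by another one
        # (ties resolve to the earliest photo in the stream).
        drop = remaining[int(np.argmax(redundancy))]
        remaining.remove(drop)

    return remaining
```

For example, with four photos where photos 0 and 1 are near-duplicates (similarity 0.9) and all other pairs score 0.3 or lower, calling `greedy_backward_summary(sim, 3)` removes one of the two near-duplicates and leaves the rest of the stream in its original order. Because photos are only ever removed, the chronological order of the aligned stream is never disturbed, which is the property the abstract refers to as temporal integrity.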
