A Probabilistic Approach to People-Centric Photo Selection and Sequencing

We present a crowdsourcing (CS) study that examines how specific image attributes probabilistically affect the selection and sequencing of images from personal photo collections. Thirteen image attributes are explored, including seven people-centric properties. We first propose a novel dataset shaping technique based on mixed integer linear programming (MILP) to identify a subset of photos in which the attributes of interest are uniformly distributed and minimally correlated. Shaping enables the synthesis of compact, balanced, and representative datasets for CS, and facilitates effective learning of both the selection likelihood of an image and its relative position in a sequence, given its attributes. We further present an ILP-based slideshow creation framework that selects and arranges a subset of appealing images from a personal photo library. Quantitative and qualitative evaluations confirm that our method outperforms regression-based and greedy approaches for photo selection and sequencing, generating slideshows similar in quality to those created by humans.
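To make the idea of dataset shaping concrete, the following is a minimal sketch of the uniformity objective only, under stated assumptions. It is not the MILP formulation of this work: it handles a single attribute that has already been quantized into bins, ignores the correlation term, and replaces the MILP solver with exhaustive search, which is feasible only for toy inputs. The function name `shape_subset` and its parameters are hypothetical.

```python
from itertools import combinations


def shape_subset(bins, k, n_bins):
    """Pick k items whose attribute histogram is closest to uniform.

    bins   : bin index (0..n_bins-1) of the attribute for each photo
    k      : size of the subset to select
    n_bins : number of histogram bins for the attribute

    Assumption: one attribute, L1 deviation from the uniform target,
    brute-force search standing in for a MILP solver.
    """
    target = k / n_bins  # ideal count per bin under a uniform distribution
    best, best_cost = None, float("inf")
    for subset in combinations(range(len(bins)), k):
        counts = [0] * n_bins
        for i in subset:
            counts[bins[i]] += 1
        # Total absolute deviation of the histogram from the uniform target.
        cost = sum(abs(c - target) for c in counts)
        if cost < best_cost:
            best, best_cost = subset, cost
    return best, best_cost


# Toy collection: bin 0 is over-represented, bin 2 is rare.
subset, cost = shape_subset([0, 0, 0, 0, 1, 1, 2], k=3, n_bins=3)
print(subset, cost)
```

In a MILP version of the same objective, each photo would get a binary selection variable and the absolute deviations would be linearized with auxiliary variables, which lets a solver handle collections far too large for enumeration.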
