Predicting movie ratings from audience behaviors

We propose a method for representing audience behavior through the facial and body motions captured in a single video stream, and we use these features to predict the ratings of feature-length movies. This is a challenging problem because: (i) the movie-viewing environment is dark and contains views of people at different scales and viewpoints; (ii) feature-length movies are long (80-120 mins), so tracking people uninterrupted for that duration remains an unsolved problem; and (iii) the expressions and motions of audience members are subtle, short, and sparse, making activity labeling unreliable. To circumvent these issues, we use an infrared-illuminated test-bed to obtain visually uniform input. We then use motion-history features, which capture the subtle movements of a person within a pre-defined volume, and form a group representation of the audience as a histogram of pairwise correlations over a small window of time. Using this group representation, we learn a movie-rating classifier from crowd-sourced ratings collected by rottentomatoes.com and demonstrate our prediction capability on audiences for 30 movies across 250 subjects (>50 hrs of video).
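As a rough illustration of the pipeline described above, the following Python sketch combines a simple motion-history accumulator with a windowed histogram of pairwise correlations between per-subject motion signals, fed to a generic SVM classifier. All parameter values (decay tau, window length, bin count) and the SVM choice are illustrative assumptions, not the exact features or classifier of the method above.

    # Minimal sketch of the described pipeline, using NumPy and scikit-learn.
    # All names and parameter values below are illustrative assumptions.
    import numpy as np
    from sklearn.svm import SVC

    def motion_history(frames, tau=30, thresh=15):
        """Motion-history image over a stack of grayscale frames (T x H x W):
        recently moving pixels are set high, older motion decays linearly."""
        mhi = np.zeros(frames[0].shape, dtype=np.float32)
        for prev, curr in zip(frames[:-1], frames[1:]):
            moving = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > thresh
            mhi = np.where(moving, tau, np.maximum(mhi - 1, 0))
        return mhi / tau  # normalize to [0, 1]

    def group_histogram(signals, win=120, bins=10):
        """Group feature: histogram of pairwise correlations between per-subject
        motion signals (N subjects x T frames), pooled over sliding windows."""
        n, t = signals.shape
        corrs = []
        for start in range(0, t - win + 1, win):
            c = np.corrcoef(signals[:, start:start + win])  # N x N matrix
            corrs.extend(c[np.triu_indices(n, k=1)])        # unique pairs only
        hist, _ = np.histogram(corrs, bins=bins, range=(-1, 1), density=True)
        return hist

    rng = np.random.default_rng(0)

    # In practice each subject's motion signal would be summarized from MHIs
    # computed within that subject's volume, e.g.:
    demo = motion_history(rng.integers(0, 255, (60, 32, 32), dtype=np.uint8))

    # Toy usage: random motion traces for 20 "audiences" of 8 subjects each,
    # with hypothetical binary ratings (e.g., fresh/rotten).
    X = np.stack([group_histogram(rng.standard_normal((8, 1200))) for _ in range(20)])
    y = rng.integers(0, 2, size=20)
    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict(X[:3]))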
