Video classification and recommendation based on affective analysis of viewers

Most previous work on video classification and recommendation has relied solely on video content, without considering affective analysis of the viewers. In this paper, we present a novel method for classifying and recommending videos based on affective analysis, mainly facial expression recognition of viewers, by fusing spatio-temporal features. For spatial features, we combine Haar-like features into compositional features according to their correlation and train a mid-level classifier; this process is embedded into an improved AdaBoost learning algorithm to obtain the spatial features. For temporal feature fusion, we adopt HDCRFs, which extend HCRFs by introducing a time-dimension variable, and embed the spatial features into the HDCRFs to recognize facial expressions. Experiments on the Cohn-Kanade database show that the proposed method achieves promising performance. Viewers' changing facial expressions are then captured frame by frame from a camera while they watch videos, and we draw affective curves that describe how the viewers' affective states change over time. From these curves, we segment each video into affective sections, classify videos into categories, and compute recommendation scores. Experimental results on our collected database show that most subjects are satisfied with the classification and recommendation results.
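As a rough illustration of the final step only, the minimal Python sketch below shows one way the per-frame expression probabilities produced by a recognizer could be smoothed into an affective curve, segmented into affective sections, and turned into a simple recommendation score. The names (`affective_curve`, `segment_curve`, `recommendation_score`), the smoothing window, and the score definition are illustrative assumptions, not the paper's implementation; the spatial feature fusion and the HDCRF recognizer are assumed to exist elsewhere and are not implemented here.

```python
import numpy as np

# Assumed expression categories; the actual label set depends on the recognizer.
EXPRESSIONS = ["neutral", "happiness", "surprise", "sadness", "anger", "fear", "disgust"]


def affective_curve(frame_probs, smooth_window=25):
    """Return the dominant expression index per frame after temporal smoothing.

    frame_probs: array of shape (n_frames, n_expressions) with per-frame
    expression probabilities from the facial expression recognizer.
    """
    kernel = np.ones(smooth_window) / smooth_window
    smoothed = np.vstack([
        np.convolve(frame_probs[:, k], kernel, mode="same")
        for k in range(frame_probs.shape[1])
    ]).T
    return smoothed.argmax(axis=1)


def segment_curve(curve):
    """Split the curve into affective sections of constant dominant expression."""
    sections, start = [], 0
    for t in range(1, len(curve)):
        if curve[t] != curve[t - 1]:
            sections.append((start, t, EXPRESSIONS[curve[start]]))
            start = t
    sections.append((start, len(curve), EXPRESSIONS[curve[start]]))
    return sections


def recommendation_score(curve, target="happiness"):
    """Fraction of frames whose dominant expression matches the target category."""
    return float(np.mean(curve == EXPRESSIONS.index(target)))
```

In use, one might recommend a video to viewers seeking a given affective category when its score for that category exceeds some threshold (for example 0.3); both the threshold and the frame-averaging choice here are illustrative design decisions, not values taken from the paper.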
