A SSIM-based approach for finding similar facial expressions

In several applications, such as facial expression transfer and facial expression based automatic album generation, the task is to find the most similar expression rather than to classify an expression into discrete, pre-defined classes. This paper proposes a novel method for finding the most similar facial expression. Instead of the usual L2 norm distance, we investigate the Structural SIMilarity (SSIM) metric as the distance measure in an unsupervised nearest neighbour algorithm. The feature vectors are generated using Active Appearance Models (AAM). We also demonstrate how the technique can be extended to find corresponding facial expression images across two or more subjects, which is useful in applications such as facial animation and automatic expression transfer. Person-independent results are reported on the Multi-PIE, FEEDTUM and AVOZES databases. We also compare SSIM against other distance metrics in a nearest neighbour search for the facial expression most similar to a given image.
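The nearest neighbour search described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how SSIM is evaluated on AAM feature vectors, so the sketch assumes a single global SSIM over the whole vector (the standard formula with population statistics), and the function names `ssim_vector` and `most_similar`, the `dynamic_range` parameter, and the constants `k1` and `k2` are hypothetical choices. Random vectors stand in for AAM features.

```python
import numpy as np


def ssim_vector(x, y, dynamic_range=1.0, k1=0.01, k2=0.03):
    """Global SSIM between two 1-D feature vectors.

    Assumption: one SSIM value is computed over the entire vector
    (no sliding window). dynamic_range, k1 and k2 are illustrative
    defaults, not values taken from the paper.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilises the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()  # population variance
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()  # population covariance
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


def most_similar(query, gallery, similarity=ssim_vector):
    """Return (index, score) of the gallery vector most similar to query.

    SSIM is a similarity, so the nearest neighbour is the arg-max,
    not the arg-min as it would be for an L2 distance.
    """
    scores = np.array([similarity(query, g) for g in gallery])
    best = int(np.argmax(scores))
    return best, float(scores[best])


# Placeholder data: 5 random "feature vectors" of length 20.
rng = np.random.default_rng(0)
gallery = rng.random((5, 20))
query = gallery[3].copy()
best_index, best_score = most_similar(query, gallery)
```

Because `most_similar` takes the similarity function as an argument, the same search could be rerun with another measure (for example a negated L2 distance) to compare metrics in the way the abstract describes.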
