The Riemannian and Affine Geometry of Facial Expression and Action Recognition

Recent advances in human 2D and 3D landmarks tracking have made it possible to model facial expression and action recognition as a temporal sequence of landmarks. We work directly with the Euclidean or affine invariants of landmarks. These invariants are represented as points in different shape spaces (Positive Semi-Definite (PSD) manifold, Grassmann manifold) and therefore their temporal evolution can be seen as a trajectory in these spaces. Using Riemannian geometry, these trajectories can be compared and classified, which has immediate applications in facial expression and action recognition.

[1]  K. Nomizu,et al.  Foundations of Differential Geometry , 1963 .

[2]  N. Higham Computing the polar decomposition with applications , 1986 .

[3]  Michel Deza,et al.  Geometry of cuts and metrics , 2009, Algorithms and combinatorics.

[4]  Klaus Obermayer,et al.  Classi cation on Pairwise Proximity , 2007 .

[5]  P. Absil,et al.  Riemannian Geometry of Grassmann Manifolds with a View on Algorithmic Computation , 2004 .

[6]  Xavier Pennec,et al.  A Riemannian Framework for Tensor Computing , 2005, International Journal of Computer Vision.

[7]  Ashok Veeraraghavan,et al.  The Function Space of an Activity , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8]  N. Ayache,et al.  Log‐Euclidean metrics for fast and simple calculus on diffusion tensors , 2006, Magnetic resonance in medicine.

[9]  Michael Werman,et al.  Affine Invariance Revisited , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  Matti Pietikäinen,et al.  Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Mubarak Shah,et al.  A 3-dimensional sift descriptor and its application to action recognition , 2007, ACM Multimedia.

[12]  Thomas Philip Runarsson,et al.  Support vector machines and dynamic time warping for time series , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).

[13]  Fatih Murat Porikli,et al.  Pedestrian Detection via Classification on Riemannian Manifolds , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Cordelia Schmid,et al.  A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.

[15]  Bart Vandereycken,et al.  Embedded geometry of the set of symmetric positive semidefinite matrices of fixed rank , 2009, 2009 IEEE/SP 15th Workshop on Statistical Signal Processing.

[16]  Lei Wang,et al.  Positive Semidefinite Metric Learning with Boosting , 2009, NIPS.

[17]  Silvere Bonnabel,et al.  Riemannian Metric and Geometric Mean for Positive Semidefinite Matrices of Fixed Rank , 2008, SIAM J. Matrix Anal. Appl..

[18]  Takeo Kanade,et al.  The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[19]  Rama Chellappa,et al.  Towards view-invariant expression analysis using analytic shape manifolds , 2011, Face and Gesture 2011.

[20]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[21]  Rama Chellappa,et al.  Silhouette-based gesture and action recognition via modeling trajectories on Riemannian shape manifolds , 2011, Comput. Vis. Image Underst..

[22]  Jake K. Aggarwal,et al.  Facial expression recognition with temporal modeling of shapes , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[23]  Silvere Bonnabel,et al.  Regression on Fixed-Rank Positive Semidefinite Matrices: A Riemannian Approach , 2010, J. Mach. Learn. Res..

[24]  Tamás D. Gedeon,et al.  Collecting Large, Richly Annotated Facial-Expression Databases from Movies , 2012, IEEE MultiMedia.

[25]  Jake K. Aggarwal,et al.  View invariant human action recognition using histograms of 3D joints , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[26]  Qingshan Liu,et al.  Learning active facial patches for expression analysis , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Qiang Ji,et al.  Capturing Complex Spatio-temporal Relations among Facial Muscles for Facial Expression Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Limin Wang,et al.  Motionlets: Mid-level 3D Parts for Human Motion Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Tamás D. Gedeon,et al.  Emotion recognition in the wild challenge (EmotiW) challenge and workshop summary , 2013, ICMI '13.

[30]  Brian Kulis,et al.  Metric Learning: A Survey , 2013, Found. Trends Mach. Learn..

[31]  Alberto Del Bimbo,et al.  Recognizing Actions from Depth Cameras as Weakly Aligned Multi-part Bag-of-Poses , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[32]  Rama Chellappa,et al.  Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Shiguang Shan,et al.  Learning Expressionlets on Spatio-temporal Manifold for Dynamic Facial Expression Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Shiguang Shan,et al.  Deeply Learning Deformable Facial Action Parts Model for Dynamic Expression Analysis , 2014, ACCV.

[35]  Stefanos Zafeiriou,et al.  Incremental Face Alignment in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Alberto Del Bimbo,et al.  Submitted to Ieee Transactions on Cybernetics 1 3d Human Action Recognition by Shape Analysis of Motion Trajectories on Riemannian Manifold , 2022 .

[37]  Junmo Kim,et al.  Joint Fine-Tuning in Deep Neural Networks for Facial Expression Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[38]  Hongdong Li,et al.  Kernel Methods on Riemannian Manifolds with Gaussian RBF Kernels , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Mehrtash Tafazzoli Harandi,et al.  Image set classification by symmetric positive semi-definite matrices , 2016, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[40]  Sergio Escalera,et al.  Support vector machines with time series distance kernels for action classification , 2016, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[41]  Zhi Liu,et al.  3D-based Deep Convolutional Neural Network for action recognition with depth sequences , 2016, Image Vis. Comput..

[42]  Anuj Srivastava,et al.  Action Recognition Using Rate-Invariant Analysis of Skeletal Shape Trajectories , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Gang Wang,et al.  Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition , 2016, ECCV.

[44]  Alan L. Yuille,et al.  Mining 3D Key-Pose-Motifs for Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Yin Wang,et al.  Efficient Temporal Sequence Comparison and Classification Using Gram Matrix Embeddings on a Riemannian Manifold , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Mohammed Bennamoun,et al.  A spatio-temporal RBM-based model for facial expression recognition , 2016, Pattern Recognit..

[47]  Mohamed Daoudi,et al.  A Novel Space-Time Representation on the Positive Semidefinite Cone for Facial Expression Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[48]  Andrea Cavallaro,et al.  Learning Bases of Activity for Facial Expression Recognition , 2017, IEEE Transactions on Image Processing.

[49]  Rushil Anirudh,et al.  Elastic Functional Coding of Riemannian Trajectories , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[50]  Xiaoming Liu,et al.  On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks , 2017, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[51]  Mohamed Daoudi,et al.  Barycentric Representation and Metric Learning for Facial Expression Recognition , 2018, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).

[52]  Stefano Berretti,et al.  A Novel Geometric Framework on Gram Matrix Trajectories for Human Behavior Understanding , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.