Long Short-Term Memory With Gate and State Level Fusion for Light Field-Based Face Recognition

Long Short-Term Memory (LSTM) is a prominent recurrent neural network for extracting dependencies from sequential data, such as time-series and multi-view data, and has achieved impressive results in a range of visual recognition tasks. A conventional LSTM network, hereafter referred to simply as an LSTM network, learns a model that extracts information from a single input sequence. However, when two or more dependent sequences are acquired simultaneously, LSTM networks can only process them one after another, without exploiting the information carried by their mutual dependencies. In this context, this paper proposes two novel LSTM cell architectures that learn jointly from multiple simultaneously acquired sequences, aiming to create richer and more effective models for recognition tasks. The efficacy of the novel LSTM cell architectures is assessed by integrating them into deep learning-based methods for face recognition with multi-view light field images. The new cell architectures jointly learn the horizontal and vertical parallaxes of the scene available in a light field image, capturing richer spatio-angular information from both directions. A comprehensive evaluation on the IST-EURECOM LFFD dataset, using three challenging evaluation protocols, shows the advantage of the novel LSTM cell architectures for face recognition over state-of-the-art light field-based methods. These results highlight the added value of the novel cell architectures when learning from correlated input sequences.
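To make the idea of learning jointly from two simultaneously acquired sequences concrete, the following is a minimal sketch in plain Python. It is not the gate-level or state-level fusion cell proposed in the paper, whose equations are not given in this abstract. It is a standard LSTM cell that, as an illustrative assumption, receives the concatenation of one horizontal-view feature and one vertical-view feature at each step, so that every gate sees both sequences at once. All names (`init_params`, `lstm_step`, `run_joint`), the feature sizes, and the toy input values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vec):
    # Dense layer: one output per weight row.
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_params(input_size, hidden_size, seed=0):
    # Hypothetical helper: small random weights for the four LSTM gates.
    rng = random.Random(seed)
    params = {}
    for gate in ("i", "f", "o", "g"):
        weights = [[rng.uniform(-0.1, 0.1)
                    for _ in range(input_size + hidden_size)]
                   for _ in range(hidden_size)]
        params[gate] = (weights, [0.0] * hidden_size)
    return params


def lstm_step(params, x, h, c):
    # Standard LSTM update on the concatenation [x, h].
    z = x + h
    i = [sigmoid(v) for v in affine(*params["i"], z)]    # input gate
    f = [sigmoid(v) for v in affine(*params["f"], z)]    # forget gate
    o = [sigmoid(v) for v in affine(*params["o"], z)]    # output gate
    g = [math.tanh(v) for v in affine(*params["g"], z)]  # candidate state
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def run_joint(params, seq_a, seq_b, hidden_size):
    # Assumption: both sequences have the same length and are aligned,
    # so step t of one sequence is paired with step t of the other.
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for xa, xb in zip(seq_a, seq_b):
        h, c = lstm_step(params, xa + xb, h, c)
    return h


# Toy stand-ins for per-view features along the two parallax directions.
horizontal = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
vertical = [[0.6, 0.5], [0.4, 0.3], [0.2, 0.1]]

params = init_params(input_size=4, hidden_size=3)
h_final = run_joint(params, horizontal, vertical, hidden_size=3)
```

Here `h_final` is a single hidden vector summarising both sequences, in contrast to running one LSTM over each sequence in turn. Input concatenation is only the simplest way to expose both sequences to the gates; the paper's cells fuse information inside the cell itself.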
