Simultaneous Semi-Coupled Dictionary Learning for Matching RGBD Data

Matching with hidden information, which is available only during training and not during testing, has recently become an important research problem. Matching data from two different modalities, known as cross-modal matching, is another challenging problem because of the large variations between data from different modalities. These are often treated as two independent problems. For applications such as matching RGBD data, however, where only one modality is available during testing, the task can reduce to either of the two. In this work, we propose a framework that handles both scenarios seamlessly, with applications to matching RGBD data of Lambertian objects. The proposed approach jointly uses the RGB and depth data to learn an illumination-invariant canonical version of the objects. Dictionaries are learnt for the RGB, depth and canonical data such that the transformed sparse coefficients of the RGB and depth data are equal to those of the canonical data. Given RGB or depth data, the sparse coefficients corresponding to its canonical version are computed and can be used directly for matching with a Mahalanobis metric. Extensive experiments on three datasets, EURECOM, VAP RGB-D-T and the Texas 3D Face Recognition database, show the effectiveness of the proposed framework.
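To make the test-time pipeline described above concrete, the following is a minimal sketch of the matching step only, not the authors' implementation. It assumes the modality dictionaries, the linear maps onto canonical coefficients, and the Mahalanobis matrix have already been learnt; here they are replaced by random or identity placeholders. The names `sparse_code`, `canonical_code`, `D_rgb`, `D_depth`, `W_rgb`, `W_depth` and `M` are hypothetical, and ISTA is used as a stand-in sparse solver because the abstract does not name one.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 with ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - step * grad, step * lam)
    return a

def canonical_code(D, W, x, lam=0.1):
    """Sparse-code x in its own modality, then map the code with the linear map W."""
    return W @ sparse_code(D, x, lam)

def mahalanobis(u, v, M):
    """Distance sqrt((u - v)^T M (u - v)) for a positive semi-definite M."""
    d = u - v
    return float(np.sqrt(d @ M @ d))

# Placeholder quantities (assumptions for illustration, not learnt values).
rng = np.random.default_rng(0)
n_atoms = 128
D_rgb = rng.standard_normal((64, n_atoms))
D_rgb /= np.linalg.norm(D_rgb, axis=0)      # unit-norm atoms
D_depth = rng.standard_normal((32, n_atoms))
D_depth /= np.linalg.norm(D_depth, axis=0)
W_rgb = np.eye(n_atoms)                     # stand-in for the learnt RGB map
W_depth = np.eye(n_atoms)                   # stand-in for the learnt depth map
M = np.eye(n_atoms)                         # stand-in for the learnt metric

# A probe seen only as RGB and a gallery item seen only as depth.
probe_rgb = rng.standard_normal(64)
gallery_depth = rng.standard_normal(32)

score = mahalanobis(
    canonical_code(D_rgb, W_rgb, probe_rgb),
    canonical_code(D_depth, W_depth, gallery_depth),
    M,
)
print(f"distance between canonical codes: {score:.4f}")
```

Because both modalities are mapped to coefficients of the same canonical representation, the same distance function serves RGB-to-RGB, depth-to-depth and RGB-to-depth comparisons; only the dictionary and map passed to `canonical_code` change.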
