Variational Relevance Vector Machine for Tabular Data

We adopt the Relevance Vector Machine (RVM) framework to handle table-structured data such as image blocks and image descriptors. This is achieved by coupling the regularization coefficients of the rows and columns of the feature table. We present two variants of this new gridRVM framework, which differ in how the row and column regularization coefficients are combined. Appropriate variational optimization algorithms are derived for inference within this framework. Coupling reduces the number of regularization parameters from the product of the table’s dimensions to their sum, which improves performance on small training sets, increases resistance to overfitting, and makes the results easier to interpret. These properties are demonstrated on synthetic datasets as well as on a modern and challenging visual identification benchmark.
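
The parameter reduction described above can be illustrated with a minimal sketch. The abstract does not state how the two variants combine row and column coefficients, so the "sum" and "product" rules below are assumptions made for illustration, and the function name `grid_precisions` and the example values are hypothetical rather than taken from the paper.

```python
import numpy as np

def grid_precisions(row_coef, col_coef, combine="sum"):
    """Expand m row and n column coefficients into an m x n table of
    per-feature regularization coefficients (illustrative only)."""
    row = np.asarray(row_coef, dtype=float)[:, None]  # shape (m, 1)
    col = np.asarray(col_coef, dtype=float)[None, :]  # shape (1, n)
    if combine == "sum":        # assumed variant: alpha_ij = a_i + b_j
        return row + col
    if combine == "product":    # assumed variant: alpha_ij = a_i * b_j
        return row * col
    raise ValueError("combine must be 'sum' or 'product'")

# Hypothetical 4 x 6 feature table.
m, n = 4, 6
row_coef = np.ones(m)        # one coefficient per row
col_coef = np.full(n, 2.0)   # one coefficient per column

alpha_sum = grid_precisions(row_coef, col_coef, "sum")
alpha_prod = grid_precisions(row_coef, col_coef, "product")

n_full = m * n   # free coefficients if every table cell had its own
n_grid = m + n   # free coefficients when rows and columns are coupled
print(alpha_sum.shape, n_full, n_grid)
```

In this sketch every cell of the table still receives its own effective coefficient, but only m + n values are free, which is the product-to-sum reduction the abstract refers to.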
