Boosting with Side Information

In many machine learning and computer vision problems, side information is available, i.e., information that is present in the training data but unavailable in the testing phase. This has motivated the recent development of a learning approach, known as learning with side information, that aims to exploit such information to improve learning algorithms. In this work, we describe a new method for training boosting classifiers with side information, which we term AdaBoost+. In particular, AdaBoost+ employs a novel classification label imputation method to construct extra weak classifiers from the available information; these classifiers simulate the performance of the better weak classifiers that could be obtained from the features in the side information. We apply our method to two problems, handwritten digit recognition and facial expression recognition from low-resolution images, and show that it is effective in terms of classification performance.
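For orientation, the sketch below shows standard discrete AdaBoost with decision stumps, the kind of boosting procedure that AdaBoost+ (as its name suggests) presumably extends. It is the baseline only: the label imputation step and the extra weak classifiers derived from side information are not reproduced, because the abstract does not specify them. All function names and the toy data are illustrative assumptions, not taken from the paper.

```python
import math

def train_stump(X, y, w):
    """Return the (error, feature, threshold, polarity) stump with the
    lowest weighted error. Exhaustive search; fine for toy data only."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                preds = [pol if x[j] >= thr else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, x):
    _, j, thr, pol = stump
    return pol if x[j] >= thr else -pol

def adaboost_fit(X, y, n_rounds=5):
    """Standard discrete AdaBoost; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # uniform initial sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the log below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy one-dimensional data (hypothetical): only test-time features are used.
X = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost_fit(X, y)
predictions = [adaboost_predict(model, x) for x in X]
```

In a side-information setting, `X` would hold only the features that remain available at test time; how AdaBoost+ uses the additional training-only features to form extra weak classifiers is described in the paper itself, not in this sketch.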
