Bayesian learning for weakly supervised object classification

We explore the extent to which we can exploit interest point detectors for representing and recognising classes of objects. Detectors propose sparse sets of candidate regions based on local salience and stability criteria. However, local selection does not take into account discrimination reliability across instances in the same object class, so we realise selection by learning from weakly supervised data in the form of images paired with their captions. Through experiments on a wide variety of object classes and detectors, we show that modeling object recognition as a constrained data association problem and learning the Bayesian way by integrating over multiple hypotheses leads to sparse classifiers that outperform contemporary methods. Moreover, our learned representations based on local features leave little room for improvement on standard image databases, so we propose new data sets to corroborate models for general object recognition.

[1]  D. McFadden A Method of Simulated Moments for Estimation of Discrete Response Models Without Numerical Integration , 1989 .

[2]  Jun S. Liu,et al.  Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes , 1994 .

[3]  S. Chib,et al.  Understanding the Metropolis-Hastings Algorithm , 1995 .

[4]  Jun S. Liu Peskun's theorem and a modified discrete-state Gibbs sampler , 1996 .

[5]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[6]  Jun S. Liu,et al.  Parameter Expansion for Data Augmentation , 1999 .

[7]  C. Robert,et al.  Computational and Inferential Difficulties with Mixture Posterior Distributions , 2000 .

[8]  George Eastman House,et al.  Sparse Bayesian Learning and the Relevance Vector Machine , 2001 .

[9]  Thomas Hofmann,et al.  Support Vector Machines for Multiple-Instance Learning , 2002, NIPS.

[10]  Thomas Hofmann,et al.  Multiple instance learning with generalized support vector machines , 2002, AAAI/IAAI.

[11]  Dan Roth,et al.  Learning a Sparse Representation for Object Detection , 2002, ECCV.

[12]  David A. Forsyth,et al.  Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.

[13]  Cordelia Schmid,et al.  An Affine Invariant Interest Point Detector , 2002, ECCV.

[14]  Jiri Matas,et al.  Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[15]  Andrew W. Moore,et al.  New Algorithms for Efficient High-Dimensional Nonparametric Classification , 2006, J. Mach. Learn. Res..

[16]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[17]  Antonio Torralba,et al.  Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes , 2003, NIPS.

[18]  Cordelia Schmid,et al.  Selection of scale-invariant parts for object class recognition , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[19]  Nando de Freitas,et al.  Bayesian Feature Weighting for Unsupervised Learning, with Application to Object Recognition , 2003, AISTATS.

[20]  Peter Auer,et al.  Weak Hypotheses and Boosting for Generic Object Detection and Recognition , 2004, ECCV.

[21]  Yee Whye Teh,et al.  Faces and names in the news , 2004, CVPR 2004.

[22]  Nando de Freitas,et al.  An Introduction to MCMC for Machine Learning , 2004, Machine Learning.

[23]  C. Schmid,et al.  Scale-invariant shape features for recognition of object categories , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[24]  Nando de Freitas,et al.  A Statistical Model for General Contextual Object Recognition , 2004, ECCV.

[25]  Tony Lindeberg,et al.  Feature Detection with Automatic Scale Selection , 1998, International Journal of Computer Vision.

[26]  Nando de Freitas,et al.  A Constrained Semi-supervised Learning Approach to Data Association , 2004, ECCV.

[27]  Cordelia Schmid,et al.  A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Luc Van Gool,et al.  Edinburgh Research Explorer Simultaneous Object Recognition and Segmentation by Image Exploration , 2022 .