Learning about Individuals from Group Statistics

We propose a new problem formulation which is similar to, but more informative than, the binary multiple-instance learning problem. In this setting, we are given groups of instances (described by feature vectors) along with estimates of the fraction of positively-labeled instances per group. The task is to learn an instance level classifier from this information. That is, we are trying to estimate the unknown binary labels of individuals from knowledge of group statistics. We propose a principled probabilistic model to solve this problem that accounts for uncertainty in the parameters and in the unknown individual labels. This model is trained with an efficient MCMC algorithm. Its performance is demonstrated on both synthetic and real-world data arising in general object recognition.

[1]  B. D. Finetti,et al.  Bayesian inference and decision techniques : essays in honor of Bruno de Finetti , 1986 .

[2]  Kotagiri Ramamohanarao,et al.  Sparse Bayesian Learning for Regression and Classification using Markov Chain Monte Carlo , 2002, ICML.

[3]  Jun S. Liu,et al.  Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes , 1994 .

[4]  Andrew P. Bradley,et al.  The use of the area under the ROC curve in the evaluation of machine learning algorithms , 1997, Pattern Recognit..

[5]  Nando de Freitas,et al.  A Constrained Semi-supervised Learning Approach to Data Association , 2004, ECCV.

[6]  Oded Maron,et al.  Multiple-Instance Learning for Natural Scene Classification , 1998, ICML.

[7]  Bernhard Pfahringer,et al.  A Two-Level Learning Method for Generalized Multi-instance Problems , 2003, ECML.

[8]  N. V. Vinodchandran,et al.  SVM-based generalized multiple-instance learning via approximate box counting , 2004, ICML.

[9]  Thomas Hofmann,et al.  Multiple-Instance Learning via Disjunctive Programming Boosting , 2003, NIPS.

[10]  Nando de Freitas,et al.  From Fields to Trees , 2004, UAI.

[11]  Thomas G. Dietterich,et al.  Solving the Multiple Instance Problem with Axis-Parallel Rectangles , 1997, Artif. Intell..

[12]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Thomas Gärtner,et al.  Multi-Instance Kernels , 2002, ICML.

[14]  Qi Zhang,et al.  EM-DD: An Improved Multiple-Instance Learning Technique , 2001, NIPS.

[15]  Hendrik Kück,et al.  Bayesian formulations of multiple instance learning with applications to general object recognition , 2004 .

[16]  Donald Geman,et al.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  D. McFadden A Method of Simulated Moments for Estimation of Discrete Response Models Without Numerical Integration , 1989 .