Multi-Instance Mixture Models and Semi-Supervised Learning

Multi-instance (MI) learning is a variant of supervised learning in which each labeled example consists of a bag (i.e., a multiset) of feature vectors rather than a single feature vector. Under standard assumptions, MI learning can be viewed as a form of semi-supervised learning (SSL). The key difference is that in MI learning, positive bag labels provide weak label information for the instances they contain. MI learning tasks can therefore be approximated as SSL tasks by discarding this weak label information, allowing existing SSL techniques to be applied directly. To give insight into this connection, we first introduce multi-instance mixture models (MIMMs), an adaptation of mixture-model classifiers to multi-instance data. We show how to learn such models with an Expectation-Maximization (EM) algorithm in the case where the instance-level class distributions are members of an exponential family. The cost of the semi-supervised approximation to multi-instance learning is then explored, both theoretically and empirically, by analyzing the properties of MIMMs relative to semi-supervised mixture models.
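
To make the semi-supervised approximation concrete, the following is a minimal sketch in Python, assuming univariate Gaussian class-conditional densities (a member of the exponential family). Instances from negative bags are treated as hard-labeled negatives, while instances pooled from positive bags are treated as unlabeled, discarding the weak "at least one positive instance" constraint; a two-component mixture is then fitted with EM. All names here (em_ssl_mimm, neg_instances, pos_bags) are illustrative and do not come from the paper.

    import numpy as np
    from scipy.stats import norm

    def em_ssl_mimm(neg_instances, pos_bags, n_iter=50):
        # Semi-supervised EM for a two-component (negative/positive)
        # Gaussian mixture. neg_instances: 1-D array of instances from
        # negative bags (hard-labeled negative). pos_bags: list of 1-D
        # arrays, one per positive bag; their instances are unlabeled.
        x = np.concatenate(pos_bags)              # unlabeled pool
        mu0, sd0 = neg_instances.mean(), neg_instances.std() + 1e-6
        mu1, sd1 = x.mean(), x.std() + 1e-6       # crude initialisation
        pi1 = 0.5                                 # positive-component weight
        for _ in range(n_iter):
            # E-step: posterior responsibility of the positive component
            # for each unlabeled instance; labeled negatives stay at 0.
            p1 = pi1 * norm.pdf(x, mu1, sd1)
            p0 = (1.0 - pi1) * norm.pdf(x, mu0, sd0)
            r = p1 / (p1 + p0 + 1e-12)
            # M-step: weighted maximum-likelihood updates; the negative
            # component also absorbs the hard-labeled negatives.
            w1 = r.sum() + 1e-12
            mu1 = (r * x).sum() / w1
            sd1 = np.sqrt((r * (x - mu1) ** 2).sum() / w1) + 1e-6
            w0 = (1.0 - r).sum() + len(neg_instances)
            mu0 = (((1.0 - r) * x).sum() + neg_instances.sum()) / w0
            sd0 = np.sqrt((((1.0 - r) * (x - mu0) ** 2).sum()
                           + ((neg_instances - mu0) ** 2).sum()) / w0) + 1e-6
            pi1 = r.sum() / len(x)                # mixing weight, unlabeled pool
        return (mu0, sd0), (mu1, sd1), pi1

Under the standard MI assumption, a new bag would then be scored positive if the posterior positive probability of its most positive instance exceeds a threshold such as 0.5. Updating the mixing weight only over the unlabeled pool is one of several reasonable design choices in this sketch; the point is merely that, once the weak bag-level information is discarded, ordinary semi-supervised mixture-model EM applies unchanged.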
