A Comparison of Multi-instance Learning Algorithms

Motivated by challenging real-world applications such as drug activity prediction and image retrieval, multi-instance (MI) learning has attracted considerable interest in recent years. Compared with standard supervised learning, the MI learning task is more difficult because the label information of each training example is incomplete. Many MI algorithms have been proposed: some are designed specifically for MI problems, whereas others have been upgraded or adapted from standard single-instance learning algorithms. Most have been evaluated on only one or two benchmark datasets, and systematic comparisons of MI learning algorithms are lacking. This thesis presents a comprehensive study of MI learning algorithms that aims to compare their performance and to identify suitable ways of addressing different MI problems. First, it briefly reviews the history of research on MI learning. Then it discusses five general classes of MI approaches that together cover 16 MI algorithms. After that, it presents empirical results for these algorithms on 15 datasets drawn from five real-world application domains. Finally, it draws three conclusions from these results: (1) on the datasets tested, applying suitable standard single-instance learners to MI problems often yields the best results, (2) algorithms that exploit the standard asymmetric MI assumption show no significant advantage over approaches that use the so-called collective assumption, and (3) different MI approaches suit different application domains, and no single MI algorithm works best on all MI problems.
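
The two assumptions contrasted in conclusion (2) can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the 0.5 threshold, and the use of a plain arithmetic mean for the collective assumption are illustrative assumptions. Under the standard asymmetric assumption, a bag is positive if at least one of its instances is positive; under the collective assumption, every instance contributes to the bag label.

```python
from statistics import mean

# Hypothetical helpers for illustration only; names and threshold are assumptions.

def bag_label_standard(instance_probs, threshold=0.5):
    """Standard (asymmetric) MI assumption: the bag is positive
    if at least one instance is predicted positive."""
    return int(max(instance_probs) >= threshold)

def bag_label_collective(instance_probs, threshold=0.5):
    """Collective assumption (sketched here as an arithmetic mean):
    all instances contribute equally to the bag-level prediction."""
    return int(mean(instance_probs) >= threshold)

# One bag with three instances, given as instance-level positive-class probabilities.
bag = [0.1, 0.2, 0.9]
standard_label = bag_label_standard(bag)      # driven by the single strong instance
collective_label = bag_label_collective(bag)  # driven by the average over the bag
```

For this bag the two rules disagree: a single confident instance is enough under the standard assumption, while the average over all instances stays below the threshold.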
