Learning Curves for the Analysis of Multiple Instance Classifiers

In Multiple Instance Learning (MIL) problems, objects are represented by a set of feature vectors, in contrast to standard pattern recognition problems, where each object is represented by a single feature vector. Numerous classifiers have been proposed to solve this type of MIL classification problem. Unfortunately, only two datasets are standard in this field (MUSK-1 and MUSK-2), and all classifiers are evaluated on these datasets using the standard classification error. In practice it is very informative to investigate their learning curves, i.e. the performance on the training and test sets for a varying number of training objects. This paper offers an evaluation of several classifiers on the standard MUSK-1 and MUSK-2 datasets as a function of the training set size. The results suggest that for smaller datasets a Parzen density estimator may be preferred over the other 'optimal' classifiers given in the literature.
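The evaluation idea above can be sketched in a few lines: fit a Gaussian Parzen-window density per class, classify test points by the larger estimated density, and record the test error for increasing training set sizes to trace a learning curve. This is a minimal illustration, not the paper's experimental setup: it uses synthetic Gaussian data in place of the MUSK features, a fixed bandwidth `h` rather than an optimized one, and single feature vectors rather than MIL bags.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

def parzen_log_density(train, x, h):
    """Log of a Gaussian Parzen-window density estimate at points x."""
    d = train.shape[1]
    sq_dists = ((x[:, None, :] - train[None, :, :]) ** 2).sum(-1)
    log_kernels = -sq_dists / (2 * h**2) - 0.5 * d * np.log(2 * np.pi * h**2)
    # Average the kernels in log space for numerical stability.
    return logsumexp(log_kernels, axis=1) - np.log(len(train))

def parzen_classify(x, pos, neg, h):
    """Assign each point to the class with the higher estimated density."""
    return (parzen_log_density(pos, x, h) > parzen_log_density(neg, x, h)).astype(int)

def make_data(n_per_class, dim=5):
    """Two synthetic Gaussian classes standing in for the MUSK features."""
    pos = rng.normal(+1.0, 1.0, size=(n_per_class, dim))
    neg = rng.normal(-1.0, 1.0, size=(n_per_class, dim))
    X = np.vstack([pos, neg])
    y = np.array([1] * n_per_class + [0] * n_per_class)
    return X, y

# Learning curve: test error as a function of the training set size.
X_test, y_test = make_data(500)
errs = []
for n_train in [5, 10, 25, 50, 100]:
    X_tr, y_tr = make_data(n_train)
    pred = parzen_classify(X_test, X_tr[y_tr == 1], X_tr[y_tr == 0], h=1.0)
    errs.append((pred != y_test).mean())
    print(f"n per class = {n_train:4d}  test error = {errs[-1]:.3f}")
```

For well-separated classes the curve flattens quickly, which mirrors the abstract's point: a Parzen estimator can already be competitive at small sample sizes, where more heavily parameterized classifiers are still far from their asymptotic error.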

[1]  Robert P. W. Duin,et al.  On the Choice of Smoothing Parameters for Parzen Estimators of Probability Density Functions , 1976, IEEE Transactions on Computers.

[2]  Sheng Gao,et al.  A Generalized Discriminative Multiple Instance Learning for Multimedia Semantic Concept Detection , 2006, 2006 International Conference on Image Processing.

[3]  Zhi-Hua Zhou,et al.  Ensembles of Multi-instance Learners , 2003, ECML.

[4]  Thomas Gärtner,et al.  Multi-Instance Kernels , 2002, ICML.

[5]  Charles X. Ling,et al.  AUC: A Better Measure than Accuracy in Comparing Learning Algorithms , 2003, Canadian Conference on AI.

[6]  Andrew P. Bradley,et al.  The use of the area under the ROC curve in the evaluation of machine learning algorithms , 1997, Pattern Recognit..

[7]  Mehryar Mohri,et al.  AUC Optimization vs. Error Rate Minimization , 2003, NIPS.

[8]  Jun Wang,et al.  Solving the Multiple-Instance Problem: A Lazy Learning Approach , 2000, ICML.

[9]  Thomas Hofmann,et al.  Multiple instance learning with generalized support vector machines , 2002, AAAI/IAAI.

[10]  Tomás Lozano-Pérez,et al.  A Framework for Multiple-Instance Learning , 1997, NIPS.

[11]  Saharon Rosset,et al.  Model selection via the AUC , 2004, ICML.

[12]  Mark R. Wade,et al.  Construction and Assessment of Classification Rules , 1999, Technometrics.

[13]  Pat Langley,et al.  Editorial: On Machine Learning , 1986, Machine Learning.

[14]  Yixin Chen,et al.  MILES: Multiple-Instance Learning via Embedded Instance Selection , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Xin Xu,et al.  Logistic Regression and Boosting for Labeled Bags of Instances , 2004, PAKDD.

[16]  Hendrik Blockeel,et al.  Machine Learning: ECML 2003 , 2003, Lecture Notes in Computer Science.

[17]  Thomas G. Dietterich,et al.  Solving the Multiple Instance Problem with Axis-Parallel Rectangles , 1997, Artif. Intell..

[18]  Qi Zhang,et al.  EM-DD: An Improved Multiple-Instance Learning Technique , 2001, NIPS.

[19]  Don R. Hush,et al.  Multiple instance learning using simple classifiers , 2004, 2004 International Conference on Machine Learning and Applications, 2004. Proceedings..

[20]  Mark Craven,et al.  Supervised versus multiple instance learning: an empirical comparison , 2005, ICML.