Fast Bundle Algorithm for Multiple-Instance Learning

We present a bundle algorithm for multiple-instance classification and ranking. These frameworks yield improved models on many problems that possess special structure. Multiple-instance loss functions are typically nonsmooth and nonconvex, and current algorithms convert them into smooth nonconvex optimization problems that are solved iteratively. Inspired by recent linear-time subgradient-based methods for support vector machines, we instead optimize the objective directly using a nonconvex bundle method. Computational results show that this method scales linearly without sacrificing generalization accuracy, which permits modeling on new and larger data sets in computational chemistry and other applications. The new implementation also facilitates modeling with kernels.
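To make the nonsmooth, nonconvex character of such an objective concrete, the following is a minimal sketch, not the paper's formulation. It assumes a linear model, a bag score equal to the maximum instance score, and a hinge loss on that score; the function name `mi_hinge_loss_and_subgradient` and the parameter `reg` are hypothetical. It returns the objective value together with one generalized subgradient, which is the kind of first-order information a bundle method would collect at each trial point.

```python
import numpy as np

def mi_hinge_loss_and_subgradient(w, bags, labels, reg=1.0):
    """Regularized multiple-instance hinge loss and one generalized subgradient.

    Assumptions (illustrative only, not taken from the paper):
      - bags is a list of (n_i, d) arrays, one array of instances per bag
      - labels holds +1 or -1 for each bag
      - the bag score is the maximum of w @ x over the bag's instances
    """
    value = 0.5 * reg * float(w @ w)      # smooth quadratic regularizer
    subgrad = reg * w.copy()              # its gradient
    for X, y in zip(bags, labels):
        scores = X @ w
        j = int(np.argmax(scores))        # instance that attains the bag score
        margin = 1.0 - y * scores[j]
        if margin > 0:                    # hinge is active for this bag
            value += margin
            subgrad -= y * X[j]
    return value, subgrad

# Tiny usage example with one positive and one negative bag.
w = np.array([1.0, 0.0])
bags = [np.array([[2.0, 0.0], [0.0, 0.0]]),
        np.array([[0.5, 0.0], [-1.0, 0.0]])]
labels = [1, -1]
value, subgrad = mi_hinge_loss_and_subgradient(w, bags, labels)
```

For a positive bag the term is the hinge of a maximum, which is nonconvex in `w`, and the `argmax` introduces kinks wherever two instances tie. Under these assumptions, that is why the objective cannot be handled by a standard convex solver without reformulation.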
