Multiple Instance Learning on Structured Data

Most existing Multiple-Instance Learning (MIL) algorithms assume that instances and/or bags are independent and identically distributed. In many applications of MIL, however, there is rich additional dependency or structure information between instances or bags, and ignoring it limits the performance of existing MIL algorithms. This paper formulates the problem as multiple instance learning on structured data (MILSD) and proposes a novel framework that takes the additional structure information into account. In particular, an effective and efficient optimization algorithm is proposed to solve the original non-convex optimization problem. It combines the Concave-Convex Constraint Programming (CCCP) method with an adapted Cutting Plane method, which handles two sets of constraints: those arising from learning on instances within individual bags and those arising from learning on structured data. The method has a convergence guarantee, with a specified precision on each set of constraints. Experimental results on three applications (webpage classification, market targeting, and protein fold identification) demonstrate the advantages of the proposed method over state-of-the-art methods.
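To illustrate the general idea behind the concave-convex step mentioned above, the following is a minimal sketch on a toy one-dimensional problem, not the MILSD objective itself. It assumes the non-convex function f(x) = x^4 - 2x^2, split by hand into a convex part u(x) = x^4 and a concave part -v(x) with v(x) = 2x^2. The function name `cccp_minimize` and its parameters are hypothetical and chosen only for this example. Each iteration replaces the concave part by its linearization at the current iterate and solves the resulting convex subproblem, which here has a closed form.

```python
import math


def cccp_minimize(x0, tol=1e-10, max_iter=200):
    """Toy concave-convex iteration on f(x) = x**4 - 2*x**2.

    Assumed split (illustrative only): u(x) = x**4 is convex and
    v(x) = 2*x**2 is convex, so f = u - v is a difference of convex functions.
    """
    x = x0
    for _ in range(max_iter):
        # Linearize the concave part -v at the current iterate: v'(x) = 4*x.
        grad_v = 4.0 * x
        # Convex subproblem: minimize x**4 - grad_v * x.
        # Setting the derivative to zero gives 4*x**3 = grad_v,
        # so the minimizer is the real cube root of grad_v / 4.
        x_new = math.copysign(abs(grad_v / 4.0) ** (1.0 / 3.0), grad_v)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


if __name__ == "__main__":
    # Starting on either side of zero reaches the local minimizer on that side.
    print(cccp_minimize(0.5))
    print(cccp_minimize(-0.5))
```

In this toy case the iterates move toward x = 1 or x = -1, the two minimizers of f, depending on the sign of the starting point. In the setting described in the abstract, the convex subproblem at each step would instead be a constrained problem, which is where the adapted Cutting Plane method is said to come in.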
