A convex relaxation for weakly supervised classifiers

This paper introduces a general multi-class approach to weakly supervised classification. Inferring the labels and learning the parameters of the model is usually done jointly through a block-coordinate descent algorithm such as expectation-maximization (EM), which may lead to local minima. To avoid this problem, we propose a cost function based on a convex relaxation of the soft-max loss. We then propose an algorithm specifically designed to efficiently solve the corresponding semidefinite program (SDP). Empirically, our method compares favorably to standard ones on different datasets for multiple instance learning and semi-supervised learning, as well as on clustering tasks.

[1]  Jun Wang,et al.  Solving the Multiple-Instance Problem: A Lazy Learning Approach , 2000, ICML.

[2]  Dale Schuurmans,et al.  Unsupervised and Semi-Supervised Multi-Class Support Vector Machines , 2005, AAAI.

[3]  David A. Forsyth,et al.  Matching Words and Pictures , 2003, J. Mach. Learn. Res..

[4]  Jean Ponce,et al.  Efficient Optimization for Discriminative Latent Class Models , 2010, NIPS.

[5]  Thomas Gärtner,et al.  Multi-Instance Kernels , 2002, ICML.

[6]  Eyke Hüllermeier,et al.  Learning from ambiguously labeled examples , 2005, Intell. Data Anal..

[7]  Hongbin Zha,et al.  Adaptive p-posterior mixture-model kernels for multiple instance learning , 2008, ICML '08.

[8]  Peter Auer,et al.  A Boosting Approach to Multiple Instance Learning , 2004, ECML.

[9]  Dale Schuurmans,et al.  Maximum Margin Clustering , 2004, NIPS.

[10]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[11]  Guillermo Sapiro,et al.  Online Learning for Matrix Factorization and Sparse Coding , 2009, J. Mach. Learn. Res..

[12]  Thomas G. Dietterich,et al.  Solving the Multiple Instance Problem with Axis-Parallel Rectangles , 1997, Artif. Intell..

[13]  Zhi-Hua Zhou,et al.  On the relation between multi-instance learning and semi-supervised learning , 2007, ICML '07.

[14]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[15]  Qi Zhang,et al.  EM-DD: An Improved Multiple-Instance Learning Technique , 2001, NIPS.

[16]  Mark Craven,et al.  Supervised versus multiple instance learning: an empirical comparison , 2005, ICML.

[17]  B. Taskar,et al.  Learning from ambiguously labeled images , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Tomás Lozano-Pérez,et al.  Image database retrieval with multiple-instance learning techniques , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[19]  Zhi-Hua Zhou,et al.  Adapting RBF Neural Networks to Multi-Instance Learning , 2006, Neural Processing Letters.

[20]  Paul A. Viola,et al.  Multiple Instance Boosting for Object Detection , 2005, NIPS.

[21]  Peter V. Gehler,et al.  Deterministic Annealing for Multiple-Instance Learning , 2007, AISTATS.

[22]  Ayhan Demiriz,et al.  Semi-Supervised Support Vector Machines , 1998, NIPS.

[23]  S. Fienberg An Iterative Procedure for Estimation in Contingency Tables , 1970 .

[24]  Oded Maron,et al.  Multiple-Instance Learning for Natural Scene Classification , 1998, ICML.

[25]  James T. Kwok,et al.  Marginalized Multi-Instance Kernels , 2007, IJCAI.

[26]  Xin Xu,et al.  Logistic Regression and Boosting for Labeled Bags of Instances , 2004, PAKDD.

[27]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[28]  Dale Schuurmans,et al.  Convex Relaxations of Latent Variable Training , 2007, NIPS.

[29]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[30]  Mikhail Belkin,et al.  Regularization and Semi-supervised Learning on Large Graphs , 2004, COLT.

[31]  Zaïd Harchaoui,et al.  DIFFRAC: a discriminative and flexible framework for clustering , 2007, NIPS.

[32]  Takeo Kanade,et al.  Discriminative cluster analysis , 2006, ICML.

[33]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[34]  Francis R. Bach,et al.  Low-Rank Optimization on the Cone of Positive Semidefinite Matrices , 2008, SIAM J. Optim..

[35]  Renato D. C. Monteiro,et al.  A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization , 2003, Math. Program..

[36]  Yixin Chen,et al.  Image Categorization by Learning and Reasoning with Regions , 2004, J. Mach. Learn. Res..

[37]  David P. Williamson,et al.  Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming , 1995, JACM.

[38]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[39]  Ming-Hsuan Yang,et al.  Visual tracking with online Multiple Instance Learning , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Xiaojin Zhu,et al.  --1 CONTENTS , 2006 .

[41]  Ashwin Srinivasan,et al.  Multi-instance tree learning , 2005, ICML.

[42]  Thomas Hofmann,et al.  Support Vector Machines for Multiple-Instance Learning , 2002, NIPS.

[43]  Rong Jin,et al.  Learning with Multiple Labels , 2002, NIPS.

[44]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.