SyMIL: MinMax Latent SVM for Weakly Labeled Data

Designing powerful models able to handle weakly labeled data is a crucial problem in machine learning. In this paper, we propose a new multiple instance learning (MIL) framework. Examples are represented as bags of instances, but we depart from standard MIL assumptions by introducing a symmetric strategy (SyMIL) that seeks discriminative instances in both positive and negative bags. The idea is to classify each bag using the instance most distant from the hyperplane. We provide a theoretical analysis of the generalization properties of our model. We derive a large-margin formulation of our problem, which is cast as a difference of convex functions and optimized using the concave-convex procedure. We provide a primal version optimized with stochastic subgradient descent and a dual version optimized with a one-slack cutting-plane method. Experimental results are reported on standard MIL and weakly supervised object detection data sets: SyMIL significantly outperforms competitive methods (mi/MI/Latent-SVM) and gives very competitive performance compared to state-of-the-art works. We also analyze the instances selected by symmetric and asymmetric approaches on weakly supervised object detection and text classification tasks. Finally, we show the complementarity of SyMIL with recent works on learning with label proportions on standard MIL data sets.
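The bag-level decision rule described above can be illustrated with a minimal sketch. It assumes a linear scoring function w·x + b over instance feature vectors; the function names symil_bag_score and mil_max_bag_score are hypothetical and chosen only for this illustration, and the sketch covers prediction only, not the training procedure. The second function shows the usual asymmetric max-score rule for contrast.

```python
import numpy as np

def symil_bag_score(bag, w, b=0.0):
    """Symmetric rule (sketch): score the bag with the instance that is
    farthest from the hyperplane, on either side.

    bag: array of shape (n_instances, n_features)
    w:   weight vector of shape (n_features,)
    Returns (signed score of the selected instance, its index).
    """
    scores = bag @ w + b                   # signed score of every instance
    idx = int(np.argmax(np.abs(scores)))   # largest absolute score
    return float(scores[idx]), idx

def mil_max_bag_score(bag, w, b=0.0):
    """Asymmetric rule for contrast (sketch): score the bag with its
    highest-scoring instance, as in standard max-based MIL prediction."""
    scores = bag @ w + b
    idx = int(np.argmax(scores))
    return float(scores[idx]), idx

# Toy bag with three 2-D instances (illustrative values only).
w = np.array([1.0, 0.0])
bag = np.array([[0.5, 0.0],
                [-2.0, 0.0],
                [1.0, 0.0]])

sym_score, sym_idx = symil_bag_score(bag, w)
max_score, max_idx = mil_max_bag_score(bag, w)
label_sym = 1 if sym_score >= 0 else -1    # predicted bag label, symmetric rule
label_max = 1 if max_score >= 0 else -1    # predicted bag label, max rule
```

In this toy bag the symmetric rule selects the strongly negative instance and labels the bag negative, while the max rule selects the mildly positive instance and labels it positive. This is the kind of difference the symmetric strategy is meant to capture.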
