Weak supervision and other non-standard classification problems: A taxonomy

A taxonomy of weakly supervised classification problems. Weak supervision in learning and prediction stages. Problem structure: instance-label relationship. Organization of the field: similarities and differences among frameworks. Revealing unexplored challenging frameworks.

In recent years, researchers in the machine learning community have presented new classification frameworks that go beyond standard supervised classification in different respects. Specifically, a wide spectrum of novel frameworks that use partially labeled data to build classifiers has been studied. To describe the state of the art, three identifying characteristics of these frameworks are considered: (1) the relationship between the instances and labels of a problem, which may go beyond the standard of one instance, one label; (2) the possible provision of partial class information for the training examples; and (3) the possible provision of partial class information for the examples in the prediction stage as well. These three characteristics are formulated as the axes of a comprehensive taxonomy that organizes the state of the art. The proposed organization makes it possible both to understand the similarities and differences among the classification problems already presented in the literature and to discover unexplored frameworks that might be seen as further challenges and research opportunities. A representative set of state-of-the-art problems is used to illustrate the taxonomy and support the discussion.
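As a rough illustration of how a three-axis taxonomy can both compare frameworks and expose unexplored ones, the sketch below encodes each framework as a point in a small grid. The category names on each axis (`InstanceLabelRelation`, `ClassInformation`), the helper functions, and the two example frameworks are assumptions made for this sketch; the abstract names only the three axes, not their values.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class InstanceLabelRelation(Enum):
    # Axis 1 (assumed values): how instances and labels relate in one example.
    SINGLE_INSTANCE_SINGLE_LABEL = "one instance, one label"
    SINGLE_INSTANCE_MULTI_LABEL = "one instance, several labels"
    MULTI_INSTANCE_SINGLE_LABEL = "several instances, one label"
    MULTI_INSTANCE_MULTI_LABEL = "several instances, several labels"


class ClassInformation(Enum):
    # Axes 2 and 3 (assumed values): class information supplied with examples.
    NONE = "no class information"
    PARTIAL = "partial class information"
    FULL = "full class information"


@dataclass(frozen=True)
class Framework:
    name: str
    relation: InstanceLabelRelation      # axis 1
    training_info: ClassInformation      # axis 2
    prediction_info: ClassInformation    # axis 3

    def axes(self):
        return (self.relation, self.training_info, self.prediction_info)


def shared_axes(a, b):
    """Count the axes on which two frameworks take the same value."""
    return sum(x == y for x, y in zip(a.axes(), b.axes()))


def unexplored_cells(frameworks):
    """List every axis combination not occupied by a known framework."""
    occupied = {f.axes() for f in frameworks}
    grid = product(InstanceLabelRelation, ClassInformation, ClassInformation)
    return [cell for cell in grid if cell not in occupied]


# Two illustrative placements (hypothetical, not taken from the paper).
standard = Framework(
    "standard supervised classification",
    InstanceLabelRelation.SINGLE_INSTANCE_SINGLE_LABEL,
    ClassInformation.FULL,
    ClassInformation.NONE,
)
weak_multi_label = Framework(
    "multi-label classification with partial training labels",
    InstanceLabelRelation.SINGLE_INSTANCE_MULTI_LABEL,
    ClassInformation.PARTIAL,
    ClassInformation.NONE,
)

known = [standard, weak_multi_label]
empty = unexplored_cells(known)
```

Here `shared_axes` gives a crude measure of similarity between two frameworks, and `unexplored_cells` lists the combinations with no known framework, which is one simple way to read "unexplored frameworks" off such a grid. Not every empty cell need correspond to a meaningful learning problem.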
