Superset Learning Based on Generalized Loss Minimization

In standard supervised learning, each training instance is associated with an outcome from a corresponding output space (e.g., a class label in classification or a real number in regression). In the superset learning problem, the outcome is only characterized in terms of a superset, that is, a subset of the output space that covers the true outcome but may also contain additional candidates. Thus, superset learning can be seen as a specific type of weakly supervised learning, in which training examples are ambiguous. In this paper, we introduce a generic approach to superset learning, which is motivated by the idea of performing model identification and "data disambiguation" simultaneously. This idea is realized by means of a generalized risk minimization approach, using an extended loss function that compares precise predictions with set-valued observations. As an illustration, we instantiate our meta-learning technique for the problem of label ranking, in which the output space consists of all permutations of a fixed set of items. The label ranking method thus obtained is compared to existing approaches tackling the same problem.
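
To make the idea of an extended loss more concrete, the following minimal Python sketch shows one way a standard loss could be lifted to set-valued observations. It assumes that the extension takes the minimum of the underlying loss over all candidates in the observed superset; the abstract does not spell out this definition, so it should be read as an illustrative assumption. The names `superset_loss`, `zero_one_loss`, and `empirical_risk` are hypothetical and chosen only for this example.

```python
from typing import Callable, Hashable, Iterable, Sequence, Tuple


def zero_one_loss(y: Hashable, y_hat: Hashable) -> float:
    """Standard 0/1 loss between a precise outcome and a prediction."""
    return 0.0 if y == y_hat else 1.0


def superset_loss(
    candidates: Iterable[Hashable],
    y_hat: Hashable,
    loss: Callable[[Hashable, Hashable], float] = zero_one_loss,
) -> float:
    """Extended loss (assumed form): the smallest base loss over all
    candidate outcomes, i.e. the most favorable disambiguation."""
    return min(loss(y, y_hat) for y in candidates)


def empirical_risk(
    data: Sequence[Tuple[object, Iterable[Hashable]]],
    predict: Callable[[object], Hashable],
    loss: Callable[[Hashable, Hashable], float] = zero_one_loss,
) -> float:
    """Average extended loss of a model over (instance, superset) pairs."""
    return sum(superset_loss(cands, predict(x), loss) for x, cands in data) / len(data)


# Toy data: each instance comes with a set of candidate labels.
data = [(0, {"a", "b"}), (1, {"b"})]

# A model that always predicts "b" is consistent with both supersets,
# whereas one that always predicts "a" contradicts the second example.
risk_b = empirical_risk(data, lambda x: "b")
risk_a = empirical_risk(data, lambda x: "a")
```

Under this assumed definition, minimizing the empirical risk over a model class selects a model and, implicitly, a disambiguation of each superset at the same time, which is the sense in which the two steps are performed simultaneously.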
