Label Distribution Learning

Although multi-label learning can handle many problems involving label ambiguity, it is a poor fit for real applications in which the overall distribution of importance across the labels matters. This paper proposes a novel learning paradigm, label distribution learning (LDL), for such applications. A label distribution covers a certain number of labels and represents the degree to which each label describes the instance. LDL is a more general learning framework that includes both single-label and multi-label learning as special cases. The paper proposes six working LDL algorithms that follow three strategies: problem transformation, algorithm adaptation, and specialized algorithm design. To compare the performance of the LDL algorithms, six representative and diverse evaluation measures are selected via a clustering analysis, and the first batch of label distribution datasets is collected and made publicly available. Experimental results on one artificial and 15 real-world datasets show clear advantages for the specialized algorithms, which indicates the importance of designing algorithms specifically for the characteristics of the LDL problem.
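
As a rough illustration of the special-case claim above, the sketch below encodes a label distribution as a list of description degrees that sum to 1. It assumes that a single-label instance puts all of its degree on one label and that a multi-label instance spreads degree uniformly over its relevant labels. The function names are hypothetical, and the Kullback-Leibler divergence is shown only as one common way to compare two distributions, not as the measure set selected in the paper.

```python
import math

def single_label_distribution(num_labels, label):
    """Single-label case: the one relevant label gets degree 1 (assumed encoding)."""
    dist = [0.0] * num_labels
    dist[label] = 1.0
    return dist

def multi_label_distribution(num_labels, relevant):
    """Multi-label case: relevant labels share the degree uniformly (assumed encoding)."""
    relevant = sorted(set(relevant))
    dist = [0.0] * num_labels
    for label in relevant:
        dist[label] = 1.0 / len(relevant)
    return dist

def kl_divergence(d_true, d_pred, eps=1e-12):
    """KL divergence between two label distributions; zero-degree true labels are skipped."""
    total = 0.0
    for p, q in zip(d_true, d_pred):
        if p > 0.0:
            total += p * math.log(p / max(q, eps))
    return total

# A general label distribution assigns every label its own degree.
general = [0.6, 0.3, 0.1, 0.0]
single = single_label_distribution(4, 1)
multi = multi_label_distribution(4, [0, 2])
```

Under these assumptions, all three vectors live in the same space (non-negative entries summing to 1), so one distance or similarity function can score a prediction regardless of which kind of supervision produced the ground truth.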
