Classification with Label Distribution Learning

Label Distribution Learning (LDL) is a novel learning paradigm whose aim is to minimize the distance between the model output and the ground-truth label distribution. We notice that, in real-world applications, the learned label distribution model is generally treated as a classification model, with the label corresponding to the highest model output taken as the predicted label, which unfortunately introduces an inconsistency between the training phase and the test phase. To resolve this inconsistency, we propose in this paper a new Label Distribution Learning algorithm for Classification (LDL4C). Firstly, instead of KL-divergence, the absolute loss is adopted as the measure for LDL4C. Secondly, samples are reweighted with information entropy. Thirdly, a large-margin classifier is adapted to boost discrimination precision. We then reveal that, theoretically, LDL4C seeks a balance between generalization and discrimination. Finally, we compare LDL4C with existing LDL algorithms on 17 real-world datasets, and the experimental results demonstrate the effectiveness of LDL4C for classification.
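The abstract names three algorithmic ingredients: an absolute (L1) loss in place of KL-divergence, entropy-based sample reweighting, and a large-margin term. Below is a minimal, illustrative sketch of how such a per-sample objective might be assembled for a linear softmax model. The specific weight formula 1/(1 + H(d)), the hinge-style margin term, and the hyperparameter rho are assumptions made here for illustration, not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def entropy(d, eps=1e-12):
    """Shannon entropy of a label distribution d."""
    return -np.sum(d * np.log(d + eps))

def ldl4c_loss(W, x, d, rho=0.1):
    """Illustrative per-sample LDL4C-style loss for a linear softmax model.

    W   : (num_labels, num_features) weight matrix (hypothetical model)
    x   : (num_features,) feature vector
    d   : (num_labels,) ground-truth label distribution (sums to 1)
    rho : margin threshold (hypothetical hyperparameter)
    """
    p = softmax(W @ x)  # predicted label distribution

    # 1) Absolute (L1) loss instead of KL-divergence.
    abs_loss = np.abs(p - d).sum()

    # 2) Entropy-based sample reweighting. Here we assume that concentrated
    #    distributions (low entropy, i.e. a clearer top label) receive more
    #    weight; the exact weighting scheme is an assumption.
    w = 1.0 / (1.0 + entropy(d))

    # 3) Large-margin term: encourage the predicted probability of the
    #    ground-truth top label to exceed the runner-up by at least rho.
    top = np.argmax(d)
    runner_up = np.max(np.delete(p, top))
    margin_loss = max(0.0, rho - (p[top] - runner_up))

    return w * abs_loss + margin_loss

# Toy usage: 3 labels, 4 features.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
x = rng.normal(size=4)
d = np.array([0.6, 0.3, 0.1])
print(ldl4c_loss(W, x, d))
```

In this reading, the entropy weight emphasizes samples whose label distributions point clearly at a single label, while the margin term directly targets the test-phase use of the model as a classifier.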
