Bidirectional Loss Function for Label Enhancement and Distribution Learning

Label distribution learning (LDL) is an interpretable and general learning paradigm that has been applied in many real-world applications. In contrast to the simple logical vector used in single-label learning (SLL) and multi-label learning (MLL), LDL assigns each instance a set of labels, each with a description degree. In practice, LDL faces two challenges: how to address the dimensional gap during learning, and how to accurately recover label distributions from existing logical labels, i.e., label enhancement (LE). Most existing LDL and LE algorithms ignore the fact that the dimension of the input matrix is much higher than that of the output matrix. Their unidirectional projection therefore amounts to a dimensionality reduction, and valuable information hidden in the feature space is lost during the mapping. To this end, this study proposes a bidirectional projection function that can be applied to both LE and LDL problems. More specifically, the novel loss function considers not only the mapping errors generated by projecting the input space into the output space but also the reconstruction errors generated by projecting the output space back into the input space. The loss function thus aims to reconstruct the input data from the output data and is therefore expected to yield more accurate results. Finally, experiments on several real-world datasets demonstrate the superiority of the proposed method for both LE and LDL.
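The abstract does not state the loss in closed form, so the following is only a minimal sketch of what a bidirectional loss could look like. It assumes a linear forward map W from features to label distributions, reuses the transpose of W as the backward map (a tied-weight choice), and weights the two terms with a hypothetical trade-off parameter `lam`. The function name, the tied weights, and the squared Frobenius norm are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def bidirectional_loss(X, D, W, lam=1.0):
    """Illustrative bidirectional loss (assumed form, not the paper's).

    X   : (n, d) feature matrix
    D   : (n, c) label distribution matrix
    W   : (d, c) linear projection from features to labels
    lam : assumed trade-off weight on the reconstruction term
    """
    # Forward direction: error of projecting inputs onto the label space.
    mapping_err = np.sum((X @ W - D) ** 2)
    # Backward direction: error of reconstructing inputs from the labels,
    # here with tied weights (W transposed) as a simplifying assumption.
    recon_err = np.sum((X - D @ W.T) ** 2)
    return mapping_err + lam * recon_err
```

With a single instance `X = [[1, 0]]`, target `D = [[1]]`, and `W = [[1], [0]]`, both terms vanish, since the forward map reproduces `D` and the backward map reproduces `X`. Setting `W` to zeros incurs both a mapping error and a reconstruction error, which is the extra penalty a purely unidirectional loss would not impose.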
