On Learning Vector Representations in Hierarchical Label Spaces

An important problem in multi-label classification is to capture label patterns or underlying structures that have an impact on such patterns. This paper addresses one such problem, namely how to exploit hierarchical structures over labels. We present a novel method to learn vector representations of a label space given a hierarchy of labels and label co-occurrence patterns. Our experimental results demonstrate qualitatively that the proposed method is able to learn regularities among labels by exploiting a label hierarchy as well as label co-occurrences. It highlights the importance of the hierarchical information in order to obtain regularities which facilitate analogical reasoning over a label space. We also experimentally illustrate the dependency of the learned representations on the label hierarchy.

[1]  Geoffrey E. Hinton,et al.  A Scalable Hierarchical Distributed Language Model , 2008, NIPS.

[2]  Inderjit S. Dhillon,et al.  Large-scale Multi-label Learning with Missing Labels , 2013, ICML.

[3]  Geoff Holmes,et al.  Classifier chains for multi-label classification , 2009, Machine Learning.

[4]  Jason Weston,et al.  Large scale image annotation: learning to rank with joint word-image embeddings , 2010, Machine Learning.

[5]  John Langford,et al.  Multi-Label Prediction via Compressed Sensing , 2009, NIPS.

[6]  Eyke Hüllermeier,et al.  Multilabel classification via calibrated label ranking , 2008, Machine Learning.

[7]  Eyke Hüllermeier,et al.  On label dependence and loss minimization in multi-label classification , 2012, Machine Learning.

[8]  Yoshua Bengio,et al.  Hierarchical Probabilistic Neural Network Language Model , 2005, AISTATS.

[9]  Hsuan-Tien Lin,et al.  Feature-aware Label Space Dimension Reduction for Multi-label Classification , 2012, NIPS.

[10]  James T. Kwok,et al.  MultiLabel Classification on Tree- and DAG-Structured Hierarchies , 2011, ICML.

[11]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[12]  Arthur Zimek,et al.  A Study of Hierarchical and Flat Classification of Proteins , 2010, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[13]  Saso Dzeroski,et al.  Decision trees for hierarchical multi-label classification , 2008, Machine Learning.

[14]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[15]  Alex A. Freitas,et al.  A survey of hierarchical classification across different application domains , 2010, Data Mining and Knowledge Discovery.

[16]  Juho Rousu,et al.  Kernel-Based Learning of Hierarchical Multilabel Classification Models , 2006, J. Mach. Learn. Res..

[17]  James T. Kwok,et al.  Efficient Multi-label Classification with Many Labels , 2013, ICML.

[18]  Hsuan-Tien Lin,et al.  Multilabel Classification with Principal Label Space Transformation , 2012, Neural Computation.