Mandatory Leaf Node Prediction in Hierarchical Multilabel Classification

In hierarchical classification, the output labels reside on a hierarchy structured as a tree or a directed acyclic graph (DAG). At test time, the prediction paths of a given test example may be required to end at leaf nodes of the label hierarchy. This is called mandatory leaf node prediction (MLNP) and is particularly useful when the leaf nodes have much stronger semantic meaning than the internal nodes. However, while many MLNP methods exist for hierarchical multiclass classification, performing MLNP in hierarchical multilabel classification is difficult. In this paper, we propose novel MLNP algorithms that consider the global structure of the label hierarchy. We show that the joint posterior probability over all the node labels can be efficiently maximized by dynamic programming for label trees, or by a greedy algorithm for label DAGs. In addition, both algorithms can be extended to minimize the expected symmetric loss. Experiments are performed on real-world MLNP data sets with label trees and label DAGs. The proposed method consistently outperforms other hierarchical and flat multilabel classification methods.
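To make the tree case concrete, the following is a minimal sketch of how a joint posterior could be maximized by dynamic programming under MLNP constraints. It is not the algorithm of the paper, whose details the abstract does not give. It assumes a hypothetical model in which each non-root node i has a probability p[i] of being positive given that its parent is positive, a node with a negative parent is always negative, the root is always positive, and every positive internal node must have at least one positive child. The names mlnp_tree_map, children, and p are illustrative only.

```python
import math


def mlnp_tree_map(children, p, root=0):
    """Sketch: most probable labeling of a label tree under MLNP constraints.

    children: dict mapping a node to the list of its children (assumed input).
    p: dict mapping each non-root node to P(node = 1 | parent = 1), assumed
       to lie strictly between 0 and 1.
    Returns (log_probability, set_of_positive_nodes).
    """
    # Visit parents before children, then reverse to get a bottom-up order.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    best = {}    # best[i]: best log-score inside i's subtree given i is positive
    chosen = {}  # chosen[i]: children set positive in that best solution
    for node in reversed(order):
        kids = children.get(node, [])
        if not kids:  # a leaf ends the prediction path
            best[node], chosen[node] = 0.0, []
            continue
        total, positive = 0.0, []
        best_gain, best_kid = -math.inf, None
        for c in kids:
            pos = math.log(p[c]) + best[c]  # child positive, plus its subtree
            neg = math.log(1.0 - p[c])      # child negative, subtree all negative
            gain = pos - neg
            if gain > best_gain:
                best_gain, best_kid = gain, c
            if gain > 0:
                total += pos
                positive.append(c)
            else:
                total += neg
        if not positive:
            # MLNP constraint: force the least costly child to be positive.
            total += best_gain
            positive.append(best_kid)
        best[node], chosen[node] = total, positive

    # Backtrack from the root to collect the positive nodes.
    labels, stack = set(), [root]
    while stack:
        node = stack.pop()
        labels.add(node)
        stack.extend(chosen[node])
    return best[root], labels


# Toy hierarchy: 0 -> {1, 2}, 1 -> {3, 4}, 2 -> {5}
children = {0: [1, 2], 1: [3, 4], 2: [5]}
p = {1: 0.6, 2: 0.3, 3: 0.2, 4: 0.3, 5: 0.9}
score, labels = mlnp_tree_map(children, p)
print(sorted(labels), round(math.exp(score), 3))  # [0, 2, 5] 0.108
```

In the toy example, node 1 is individually more likely than node 2, yet the path through node 2 wins because its leaf is far more probable. This illustrates why considering the whole hierarchy can give a different answer from thresholding each node locally.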
