A TDIDT technique for multi-label classification

A growing number of important problems involve patterns that can be associated with several classes simultaneously. Such problems, usually called multi-label problems, call for specific techniques in order to obtain models more accurate than those produced by classical classification algorithms. This work presents an adaptation of the J48 algorithm to multi-label classification. The resulting algorithm generates interpretable models and has been tested on several datasets. Experiments show that its performance is similar to that of other multi-label tree-based approaches and that it is especially suitable as a base classifier in an ensemble.
