A multi-label classification algorithm based on label-specific features

Aiming at the problem of multi-label classification, a multi-label classification algorithm based on label-specific features is proposed in this paper. In this algorithm, we compute feature density on the positive and negative instances set of each class firstly and then select mk features of high density from the positive and negative instances set of each class, respectively; the intersection is taken as the label-specific features of the corresponding class. Finally, multi-label data are classified on the basis of label-specific features. The algorithm can show the label-specific features of each class. Experiments show that our proposed method, the MLSF algorithm, performs significantly better than the other state-of-the-art multi-label learning approaches.

[1]  Juho Rousu,et al.  On Maximum Margin Hierarchical Classification , 2004 .

[2]  Zhi-Hua Zhou,et al.  ML-KNN: A lazy learning approach to multi-label learning , 2007, Pattern Recognit..

[3]  Amanda Clare,et al.  Knowledge Discovery in Multi-label Phenotype Data , 2001, PKDD.

[4]  Foster J. Provost,et al.  Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction , 2003, J. Artif. Intell. Res..

[5]  Yoram Singer,et al.  BoosTexter: A Boosting-based System for Text Categorization , 2000, Machine Learning.

[6]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[7]  Yiming Yang,et al.  An Evaluation of Statistical Approaches to Text Categorization , 1999, Information Retrieval.

[8]  Thomas L. Griffiths,et al.  Infinite latent feature models and the Indian buffet process , 2005, NIPS.

[9]  Grigorios Tsoumakas,et al.  Multi-Label Classification , 2009, Database Technologies: Concepts, Methodologies, Tools, and Applications.

[10]  Koby Crammer,et al.  A new family of online algorithms for category ranking , 2002, SIGIR '02.

[11]  Jiebo Luo,et al.  Learning multi-label scene classification , 2004, Pattern Recognit..

[12]  Edward Y. Chang,et al.  CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines , 2003, IEEE Trans. Circuits Syst. Video Technol..

[13]  Marko Grobelnik,et al.  Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II , 2009 .

[14]  Thomas Hofmann,et al.  Hierarchical document categorization with support vector machines , 2004, CIKM '04.

[15]  Grigorios Tsoumakas,et al.  Multi-Label Classification: An Overview , 2007, Int. J. Data Warehous. Min..

[16]  Yang Yu,et al.  Predicting Future Customers via Ensembling Gradually Expanded Trees , 2007, Int. J. Data Warehous. Min..

[17]  Joachim M. Buhmann,et al.  Classification of Multi-labeled Data: A Generative Approach , 2008, ECML/PKDD.

[18]  Yihong Gong,et al.  Multi-labelled classification using maximum entropy method , 2005, SIGIR '05.

[19]  Anil K. Jain,et al.  Data clustering: a review , 1999, CSUR.

[20]  Yiming Yang,et al.  RCV1: A New Benchmark Collection for Text Categorization Research , 2004, J. Mach. Learn. Res..

[21]  Jason Weston,et al.  A kernel method for multi-labelled classification , 2001, NIPS.