Hierarchical Text Categorization in a Transductive Setting

Transductive learning is the learning setting that permits to learn from "particular to particular'' and to consider both labelled and unlabelled examples when taking classification decisions. In this paper, we investigate the use of transductive learning in the context of hierarchical text categorization. At this aim, we exploit a modified version of an inductive hierarchical learning framework that permits to classify documents in internal and leaf nodes of a hierarchy of categories. Experimental results on real world datasets are reported.

[1]  Martin F. Porter,et al.  An algorithm for suffix stripping , 1997, Program.

[2]  Alexander Gammerman,et al.  Learning by Transduction , 1998, UAI.

[3]  Inderjit S. Dhillon,et al.  Co-clustering documents and words using bipartite spectral graph partitioning , 2001, KDD '01.

[4]  Igor Kononenko,et al.  Reliable Classifications with Machine Learning , 2002, ECML.

[5]  Robert Tibshirani,et al.  Classification by Pairwise Coupling , 1997, NIPS.

[6]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[7]  Tom M. Mitchell,et al.  Improving Text Classification by Shrinkage in a Hierarchy of Classes , 1998, ICML.

[8]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9]  Guoping Wang,et al.  Learning with progressive transductive Support Vector Machine , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[10]  Padmini Srinivasan,et al.  Hierarchical Text Categorization Using Neural Networks , 2004, Information Retrieval.

[11]  Hans-Hermann Bock,et al.  Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data , 2000 .

[12]  Andreas S. Weigend,et al.  Exploiting Hierarchy in Text Categorization , 1999, Information Retrieval.

[13]  Michelangelo Ceci,et al.  Classifying web documents in a hierarchy of categories: a comprehensive study , 2007, Journal of Intelligent Information Systems.

[14]  NgHwee Tou,et al.  Feature selection, perceptron learning, and a usability case study for text categorization , 1997 .

[15]  Dunja Mladenic,et al.  Machine Learning on non-homogeneous, distributed text data , 1998 .

[16]  Matthias Seeger,et al.  Learning from Labeled and Unlabeled Data , 2010, Encyclopedia of Machine Learning.

[17]  Thorsten Joachims,et al.  Transductive Learning via Spectral Graph Partitioning , 2003, ICML.

[18]  Y Yang,et al.  An evaluation of statistical approaches to MEDLINE indexing. , 1996, Proceedings : a conference of the American Medical Informatics Association. AMIA Fall Symposium.

[19]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[20]  Kristin P. Bennett,et al.  Combining support vector and mathematical programming methods for classification , 1999 .

[21]  Guoping Wang,et al.  Learning with progressive transductive support vector machine , 2003, Pattern Recognit. Lett..

[22]  Susan T. Dumais,et al.  Hierarchical classification of Web content , 2000, SIGIR '00.

[23]  Thomas G. Dietterich,et al.  A study of distance-based machine learning algorithms , 1994 .

[24]  Yoram Singer,et al.  BoosTexter: A Boosting-based System for Text Categorization , 2000, Machine Learning.

[25]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[26]  Sholom M. Weiss,et al.  Automated learning of decision rules for text categorization , 1994, TOIS.