Paper information - Linear Discriminant Analysis in Document Classification

Document representation using the bag-of-words approach may require bringing the dimensionality of the representation down in order to be able to make effective use of various statistical classification methods. Latent Semantic Indexing (LSI) is one such method that is based on eigendecomposition of the covariance of the document-term matrix. Another often used approach is to select a small number of most important features out of the whole set according to some relevant criterion. This paper points out that LSI ignores discrimination while concentrating on representation. Furthermore, selection methods fail to produce a feature set that jointly optimizes class discrimination. As a remedy, we suggest supervised linear discriminative transforms, and report good classification results applying these to the Reuters-21578 database.
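The contrast the abstract draws between representation and discrimination can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than the paper's method or data: the two-term toy documents, the function names, and the use of a two-class Fisher discriminant as the simplest supervised linear discriminative transform. The unsupervised direction is a PCA-style principal axis of the pooled scatter, used as a stand-in for LSI as the abstract describes it.

```python
import math

# Hypothetical toy corpus: each document is (weight of term 1, weight of term 2).
# Term 1 varies widely but is unrelated to the class; term 2 carries the class signal.
class_a = [(0.0, 0.0), (4.0, 1.0), (8.0, 0.0), (12.0, 1.0)]
class_b = [(0.0, 2.0), (4.0, 3.0), (8.0, 2.0), (12.0, 3.0)]

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around center."""
    sxx = sum((p[0] - center[0]) ** 2 for p in points)
    sxy = sum((p[0] - center[0]) * (p[1] - center[1]) for p in points)
    syy = sum((p[1] - center[1]) ** 2 for p in points)
    return sxx, sxy, syy

def principal_direction(points):
    """Unsupervised: unit eigenvector of the largest eigenvalue of the total scatter."""
    sxx, sxy, syy = scatter(points, mean(points))
    if sxy == 0:
        return (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    lam = (sxx + syy + math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)) / 2
    vx, vy = sxy, lam - sxx
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

def fisher_direction(a, b):
    """Supervised: w = Sw^{-1} (mean_b - mean_a), Sw = within-class scatter."""
    ma, mb = mean(a), mean(b)
    axx, axy, ayy = scatter(a, ma)
    bxx, bxy, byy = scatter(b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy  # assumes Sw is invertible
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)

def project(points, w):
    return [p[0] * w[0] + p[1] * w[1] for p in points]

def ranges_overlap(u, v):
    return max(min(u), min(v)) <= min(max(u), max(v))

pca_dir = principal_direction(class_a + class_b)
lda_dir = fisher_direction(class_a, class_b)

print("classes overlap on principal axis:",
      ranges_overlap(project(class_a, pca_dir), project(class_b, pca_dir)))
print("classes overlap on Fisher axis:",
      ranges_overlap(project(class_a, lda_dir), project(class_b, lda_dir)))
```

In this toy setting the principal axis follows the high-variance term and mixes the two classes, while the Fisher axis weights the low-variance term that actually separates them. A real document-term matrix has thousands of dimensions and many classes, so this is only a picture of the idea, not a recipe.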

Kari Torkkola
