Paper Information - Hierarchically Classifying Documents Using Very Few Words

The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. One can use existing classifiers by ignoring the hierarchical structure, treating the topics as separate classes. Unfortunately, in the context of text categorization, we are faced with a large number of classes and a huge number of relevant features needed to distinguish between them. Consequently, we are restricted to using only very simple classifiers, both because of computational cost and the tendency of complex models to overfit. We propose an approach that utilizes the hierarchical topic structure to decompose the classification task into a set of simpler problems, one at each node in the classification tree. As we show, each of these smaller problems can be solved accurately by focusing only on a very small set of features, those relevant to the task at hand. This set of relevant features varies widely throughout the hierarchy, so that, while the overall relevant feature set may be large, each classifier only examines a small subset. The use of reduced feature sets allows us to utilize more complex (probabilistic) models, without encountering the computational and robustness difficulties described above.
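The decomposition described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy hierarchy `TREE`, the eight-document `CORPUS`, and the helper names are hypothetical, and mutual information with a Bernoulli naive Bayes model stands in for whatever feature-selection criterion and probabilistic classifier the paper actually uses. The sketch only shows the general idea: one small classifier per internal node, each looking at its own handful of words.

```python
import math
from collections import Counter

# Hypothetical toy hierarchy: internal node -> children. Leaves are topics.
TREE = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["soccer", "tennis"],
}

# Hypothetical training data: (set of words in the document, leaf topic).
CORPUS = [
    ({"quantum", "particle", "experiment"}, "physics"),
    ({"particle", "energy", "experiment"}, "physics"),
    ({"gene", "cell", "experiment"}, "biology"),
    ({"gene", "protein", "experiment"}, "biology"),
    ({"goal", "team", "match"}, "soccer"),
    ({"goal", "league", "match"}, "soccer"),
    ({"serve", "racket", "match"}, "tennis"),
    ({"serve", "court", "match"}, "tennis"),
]


def leaves_under(node):
    """Return the set of leaf topics below (or equal to) `node`."""
    if node not in TREE:
        return {node}
    return set().union(*(leaves_under(child) for child in TREE[node]))


def mutual_information(docs, labels, word):
    """Mutual information (in nats) between presence of `word` and the label."""
    n = len(docs)
    joint = Counter((word in d, y) for d, y in zip(docs, labels))
    px = Counter(word in d for d in docs)
    py = Counter(labels)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


class NodeClassifier:
    """Bernoulli naive Bayes restricted to the k most informative words."""

    def __init__(self, docs, labels, k):
        vocab = sorted(set().union(*docs))
        # Assumption: rank words by mutual information with the child label.
        self.features = sorted(
            vocab, key=lambda w: -mutual_information(docs, labels, w))[:k]
        self.classes = sorted(set(labels))
        self.prior, self.cond = {}, {}
        for c in self.classes:
            members = [d for d, y in zip(docs, labels) if y == c]
            self.prior[c] = len(members) / len(docs)
            # Laplace-smoothed estimate of P(word present | class).
            self.cond[c] = {
                w: (sum(w in d for d in members) + 1) / (len(members) + 2)
                for w in self.features
            }

    def predict(self, doc):
        def log_posterior(c):
            score = math.log(self.prior[c])
            for w in self.features:
                p = self.cond[c][w]
                score += math.log(p if w in doc else 1 - p)
            return score
        return max(self.classes, key=log_posterior)


def train(corpus, k=2):
    """Fit one classifier per internal node, using only documents below it."""
    models = {}
    for node, children in TREE.items():
        docs, labels = [], []
        for words, leaf in corpus:
            for child in children:
                if leaf in leaves_under(child):
                    docs.append(words)
                    labels.append(child)
        models[node] = NodeClassifier(docs, labels, k)
    return models


def classify(models, words):
    """Walk from the root to a leaf, asking each node's classifier in turn."""
    node = "root"
    while node in TREE:
        node = models[node].predict(words)
    return node


models = train(CORPUS, k=2)
for node, model in models.items():
    print(node, model.features)
print(classify(models, {"particle", "energy", "experiment"}))
```

In this toy setup each node keeps only two words, yet the words differ from node to node, so the hierarchy as a whole draws on a larger vocabulary than any single classifier examines. A greedy top-down walk like `classify` cannot recover from a wrong decision near the root; that trade-off belongs to this sketch and is not a claim about the paper's results.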

Daphne Koller | Mehran Sahami
