Document Classification By Machine: Theory and Practice
In this note, we present theoretical and practical results on determining which of several categories a given document best fits. We describe a mathematical model of classification schemes and identify the one scheme that can be proved optimal among all those based on word frequencies. Finally, we report the results of an experiment that illustrates the efficacy of this classification method.
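The abstract does not spell out the scheme itself, so the following is only an illustrative sketch of what a word-frequency-based classifier can look like, not the method of the note. It assumes a multinomial model with add-one smoothing and picks the category with the highest log-likelihood; the function names `train` and `classify`, the smoothing parameter `alpha`, and the whitespace tokenization are all assumptions made for this example.

```python
import math
from collections import Counter


def train(docs_by_category, alpha=1.0):
    """Estimate smoothed per-category word probabilities (hypothetical helper).

    docs_by_category maps a category name to a list of training texts.
    alpha is an assumed additive-smoothing constant, not taken from the note.
    """
    vocab = {
        w
        for docs in docs_by_category.values()
        for d in docs
        for w in d.lower().split()
    }
    model = {}
    for cat, docs in docs_by_category.items():
        counts = Counter(w for d in docs for w in d.lower().split())
        total = sum(counts.values()) + alpha * len(vocab)
        model[cat] = {w: (counts[w] + alpha) / total for w in vocab}
    return model


def classify(model, text):
    """Return the category with the highest multinomial log-likelihood.

    Words never seen in training are ignored (an assumption of this sketch).
    """
    words = Counter(text.lower().split())

    def score(cat):
        probs = model[cat]
        return sum(n * math.log(probs[w]) for w, n in words.items() if w in probs)

    return max(model, key=score)


# Toy training data, invented purely for illustration.
training = {
    "sports": ["the team won the match", "goal scored in the match"],
    "finance": ["the bank raised interest rates", "stock market rates fell"],
}
model = train(training)
label = classify(model, "the match ended with a late goal")
```

With this toy data, a document dominated by words frequent in one category's training texts is assigned to that category. Whether this coincides with the provably optimal scheme described in the note depends on modeling assumptions that the abstract does not state.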