Paper information - Using cases to represent context for text classification

Using cases to represent context for text classification

Research on text classification has typically focused on keyword searches and statistical techniques. Keywords alone cannot always distinguish the relevant from the irrelevant texts and some relevant texts do not contain any reliable keywords at all. Our approach to text classification uses case-based reasoning to represent natural language contexts that can be used to classify texts with extremely high precision. The case base of natural language contexts is acquired automatically during sentence analysis using a training corpus of texts and their correct relevancy classifications. A text is represented as a set of cases and we classify a text as relevant if any of its cases are deemed to be relevant. We rely on the statistical properties of the case base to determine whether similar cases are highly correlated with relevance for the domain. Preliminary experiments suggest that case-based text classification can achieve very high levels of precision and outperforms our previous algorithms based on relevancy signatures.

Ellen Riloff
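
The classification rule described in the abstract (a text is a set of cases, and it is relevant if any one of its cases is judged relevant from case-base statistics) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: cases are plain tuples, "similar" is simplified to exact match, and the function names, the `min_count` and `min_precision` thresholds, and the toy training data are all hypothetical.

```python
from collections import defaultdict


def train_case_base(training_texts):
    """Build case statistics from (cases, is_relevant) pairs.

    Each case is a hashable tuple (a hypothetical stand-in for a
    natural language context). Returns case -> [relevant, total].
    """
    counts = defaultdict(lambda: [0, 0])
    for cases, is_relevant in training_texts:
        for case in cases:
            counts[case][1] += 1          # total occurrences of this case
            if is_relevant:
                counts[case][0] += 1      # occurrences in relevant texts
    return dict(counts)


def case_is_relevant(case, case_base, min_count=2, min_precision=0.9):
    """A case counts as relevant if matching training cases were seen
    often enough and were highly correlated with relevant texts.
    Both thresholds are illustrative assumptions."""
    relevant, total = case_base.get(case, (0, 0))
    return total >= min_count and relevant / total >= min_precision


def classify_text(cases, case_base, **thresholds):
    """A text is relevant if ANY of its cases is deemed relevant."""
    return any(case_is_relevant(c, case_base, **thresholds) for c in cases)


# Toy, made-up training corpus: (cases found in a text, correct label).
training = [
    ([("bombed", "embassy")], True),
    ([("bombed", "embassy"), ("said", "officials")], True),
    ([("said", "officials")], False),
]
case_base = train_case_base(training)

relevant_text = classify_text([("bombed", "embassy")], case_base)
irrelevant_text = classify_text([("said", "officials")], case_base)
```

Requiring only one sufficiently reliable case favors precision: a text is never labeled relevant on the strength of weakly correlated evidence, at the likely cost of missing relevant texts whose cases were rare in training.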
