Paper information - Question terminology and representation for question type classification


Question terminology is a set of terms that appear in keywords, idioms and fixed expressions commonly observed in questions. This paper investigates ways to automatically extract question terminology from a corpus of questions and to represent it for the purpose of classifying questions by type. Our key interest is whether semantic features can enhance a representation of the strongly lexical nature of question sentences. We compare two feature sets: one with lexical features only, and another with a mixture of lexical and semantic features. For evaluation, we measure the classification accuracy achieved by two machine learning algorithms, C5.0 and PEBLS, using a procedure called domain cross-validation, which effectively measures the domain transferability of features.
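The abstract names domain cross-validation without spelling out the procedure. The sketch below is one plausible reading, assuming that each fold holds out every question from one domain, trains on the remaining domains, and reports accuracy on the held-out domain. The toy word-vote classifier, the function names, the example questions and the question-type labels are all hypothetical stand-ins for illustration; they are not the paper's C5.0 or PEBLS setup or its data.

```python
from collections import Counter, defaultdict

def train(examples):
    """Count how often each word co-occurs with each question type."""
    votes = defaultdict(Counter)
    for _domain, words, label in examples:
        for w in words:
            votes[w][label] += 1
    return votes

def predict(votes, words, default="UNKNOWN"):
    """Sum the per-word label counts and return the most voted label."""
    tally = Counter()
    for w in words:
        tally.update(votes.get(w, {}))
    return tally.most_common(1)[0][0] if tally else default

def domain_cross_validation(examples):
    """Hold out one domain per fold; train on the rest, test on the held-out one."""
    domains = sorted({d for d, _, _ in examples})
    scores = {}
    for held_out in domains:
        train_set = [e for e in examples if e[0] != held_out]
        test_set = [e for e in examples if e[0] == held_out]
        model = train(train_set)
        correct = sum(predict(model, words) == label for _, words, label in test_set)
        scores[held_out] = correct / len(test_set)
    return scores

# Hypothetical toy corpus: (domain, tokenized question, question type).
examples = [
    ("cooking", ["how", "do", "i", "bake", "bread"], "PROCEDURE"),
    ("cooking", ["what", "is", "yeast"], "DEFINITION"),
    ("cars", ["how", "do", "i", "change", "oil"], "PROCEDURE"),
    ("cars", ["what", "is", "torque"], "DEFINITION"),
    ("pets", ["how", "do", "i", "train", "a", "puppy"], "PROCEDURE"),
    ("pets", ["what", "is", "kibble"], "DEFINITION"),
]

scores = domain_cross_validation(examples)
for domain, accuracy in scores.items():
    print(f"held-out domain {domain}: accuracy {accuracy:.2f}")
```

Because the test questions never share a domain with the training questions, only domain-independent words such as "how" or "what" can carry the vote. Under this reading, that is why per-domain accuracy serves as a measure of how well a feature set transfers across domains.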

Noriko Tomuro
