Feature Extraction for Learning to Classify Questions

In this paper, we present a new approach to learning the classification of questions. Question classification has recently received interest in the context of question answering systems, where categorizing a given question allows improved processing of documents to identify an answer. Our approach relies on relatively simple preprocessing of the question and uses standard decision tree learning. We also compared our results from decision tree learning with those obtained using Naive Bayes. Both sets of results compare favorably to several very recent studies that use more sophisticated preprocessing and/or more sophisticated learning techniques. Furthermore, the fact that decision tree learning proved more successful than Naive Bayes is significant in itself, as decision tree learning is usually believed to be less suitable for NLP tasks.