Question analysis for Indonesian comparative question

Information seeking is a basic human need today. Comparing things with a search engine takes more time than searching for a single thing. In this paper, we analyze comparative questions for a comparative question answering system. A comparative question is a question that compares two or more entities. We grouped comparative questions into 5 types: selection between mentioned entities, selection between unmentioned entities, selection between any entity, comparison, and yes/no question. We then extracted 4 types of information from comparative questions: entity, aspect, comparison, and constraint. We built classifiers for the classification task and the information extraction task. The classification task uses bag-of-words features, whereas the information extraction task uses the lexical feature of the current word, the lexical features of the 2 previous and 2 following words, and the previous label. We tried 2 scenarios: classification first and extraction first. In the classification-first scenario, the classification result is used as a feature for extraction. In the extraction-first scenario, the extraction result is used as features for classification. We found that results are better when extraction is done before classification. For the extraction task, classification using SMO gave the best result (88.78%), while for the classification task, Naive Bayes performed better (82.35%).
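
The feature set described for the extraction task (the current word, the 2 previous and 2 following words, and the previous label) can be sketched as follows. This is only a minimal illustration, not the implementation used in the paper: the function names, the padding and start symbols, the "O" label for words outside any information type, and the example question with its labels are all assumptions made for this sketch.

```python
from typing import Dict, List

PAD = "<pad>"      # assumed placeholder for positions outside the question
START = "<start>"  # assumed previous label for the first word


def token_features(tokens: List[str], i: int, prev_label: str) -> Dict[str, str]:
    """Build the features of word i: the word itself, its 2 left and
    2 right neighbours, and the label assigned to the previous word."""
    def word_at(j: int) -> str:
        return tokens[j].lower() if 0 <= j < len(tokens) else PAD

    return {
        "w0": word_at(i),
        "w-2": word_at(i - 2),
        "w-1": word_at(i - 1),
        "w+1": word_at(i + 1),
        "w+2": word_at(i + 2),
        "prev_label": prev_label,
    }


def question_features(tokens: List[str], labels: List[str]) -> List[Dict[str, str]]:
    """Build one feature dict per word, using gold labels as the
    previous-label feature (as one would when preparing training data)."""
    feats = []
    prev = START
    for i in range(len(tokens)):
        feats.append(token_features(tokens, i, prev))
        prev = labels[i]
    return feats


# Hypothetical example: "Which is cheaper, Jakarta or Bandung?"
tokens = ["Mana", "yang", "lebih", "murah", "Jakarta", "atau", "Bandung"]
# Hypothetical labels, invented for illustration only.
labels = ["O", "O", "COMPARISON", "ASPECT", "ENTITY", "O", "ENTITY"]

for tok, feat in zip(tokens, question_features(tokens, labels)):
    print(tok, feat)
```

Each resulting dictionary could be passed to any classifier that accepts categorical features. At prediction time the previous label would come from the classifier's own output for the preceding word rather than from gold labels.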