Offline Definition Extraction Using Machine Learning for Knowledge-Oriented Question Answering

In this paper, we propose an approach to offline definition extraction using machine learning. We first introduce a new framework for knowledge-based question answering (QA) systems, in which answers are extracted and saved in an answer base beforehand. To scale to large applications, answer extraction must be independent of the question; we call this task offline answer extraction. We manually label the definitions in documents and use them as training data for the definition extraction model. We employ three classification models: decision tree, Naive Bayes, and SVM. The experimental results show that SVM achieves the best performance in definition extraction and that our approach outperforms the baseline, indicating that it is effective.
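
As a rough illustration of the pipeline outlined in the abstract, the following Python sketch trains a classifier on manually labeled sentences and then fills an answer base without reference to any question. It is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the bag-of-words features, the toy training sentences, and the names `tokenize`, `NaiveBayes`, `TRAIN`, and `build_answer_base` are all hypothetical, and only one of the three model families named above (Naive Bayes) is shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    # Assumption: lowercase bag-of-words tokens stand in for the real features.
    return sentence.lower().replace(".", " ").replace(",", " ").split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokenize(sentence):
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labeled examples: True marks a definition sentence.
TRAIN = [
    ("A compiler is a program that translates source code into machine code.", True),
    ("An ontology is a formal specification of a shared conceptualization.", True),
    ("Tokenization is the process of splitting text into tokens.", True),
    ("We ran the experiments on three machines last week.", False),
    ("The results are reported in the next section.", False),
    ("Please install the package before running the tests.", False),
]


def build_answer_base(model, documents):
    # Offline step: no question is consulted while collecting definitions.
    answer_base = []
    for document in documents:
        for sentence in document:
            if model.predict(sentence):
                answer_base.append(sentence)
    return answer_base


model = NaiveBayes().fit([s for s, _ in TRAIN], [y for _, y in TRAIN])
documents = [[
    "A parser is a program that analyzes the structure of text.",
    "We installed the package on three machines.",
]]
answer_base = build_answer_base(model, documents)
```

In this sketch the answer base is a plain list built once per document collection; a QA system following the framework above would presumably look up stored definitions at question time rather than rerun extraction.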