TETEYEQ: Amharic Question Answering For Factoid Questions

The number of Amharic documents on the Web is increasing as many newspaper publishers have started offering their services electronically. People have relied on information retrieval (IR) systems to satisfy their information needs, but these systems have been criticized for failing to deliver “ready-made” information to the user. Question answering systems have therefore emerged as a better way to get the required information to the user, with the help of information extraction techniques. We studied the language-specific issues of Amharic extensively and found document normalization to be crucial for the performance of our question answering system: performance on normalized documents is higher than on un-normalized ones. Based on this investigation, we use a distinct technique to determine question types, possible question focuses, and expected answer types, and to generate a proper IR query. Document retrieval works at three levels of granularity: sentence, paragraph, and file. We developed an algorithm for sentence/paragraph re-ranking and answer selection. The named-entity (gazetteer) based and pattern-based answer pinpointing algorithms we developed help locate possible answer particles in a document. The rule-based question classification module classifies about 89% of the questions correctly. The document retrieval component shows high coverage of relevant documents (97%), with sentence-based retrieval showing the lowest coverage (93%), which contributes to the better recall of our system. Gazetteer-based answer selection using the paragraph answer selection technique answers 72% of the questions correctly, which can be considered promising. The file-based answer selection technique exhibits better recall (91%), which indicates that most of the relevant documents thought to contain the correct answer are returned.
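To make the normalization and rule-based classification steps more concrete, the following is a minimal Python sketch, not the system's actual implementation. It assumes that normalization means collapsing homophonic Ethiopic characters (for example ሐ and ኀ to ሀ) into one canonical form, and that classification is a lookup of interrogative words. The function names, the character series chosen, the four interrogatives (ማን "who", የት "where", መቼ "when", ስንት "how many"), and the answer type labels are all illustrative assumptions.

```python
# Illustrative sketch only: the tables below are assumptions,
# not the rules or character mappings used in the paper.

def _build_homophone_table():
    """Map homophonic Ethiopic character series onto one canonical series."""
    # (variant series start, canonical series start) as Unicode code points
    series = [
        (0x1210, 0x1200),  # ሐ series -> ሀ series
        (0x1280, 0x1200),  # ኀ series -> ሀ series
        (0x1220, 0x1230),  # ሠ series -> ሰ series
        (0x12D0, 0x12A0),  # ዐ series -> አ series
        (0x1340, 0x1338),  # ፀ series -> ጸ series
    ]
    table = {}
    for variant, canonical in series:
        for order in range(7):  # the seven vowel orders of each series
            table[variant + order] = chr(canonical + order)
    return table


HOMOPHONES = _build_homophone_table()

# Hypothetical rule table: interrogative word -> expected answer type.
ANSWER_TYPE_RULES = {
    "ማን": "PERSON",
    "የት": "PLACE",
    "መቼ": "TIME",
    "ስንት": "QUANTITY",
}


def normalize(text):
    """Replace homophonic variants so that spelling variants compare equal."""
    return text.translate(HOMOPHONES)


def classify_question(question):
    """Return the expected answer type of the first interrogative word found."""
    cleaned = normalize(question).replace("?", " ").replace("፧", " ")
    for token in cleaned.split():
        if token in ANSWER_TYPE_RULES:
            return ANSWER_TYPE_RULES[token]
    return "UNKNOWN"


if __name__ == "__main__":
    print(normalize("ፀሐይ"))
    print(classify_question("ኢትዮጵያ የት ትገኛለች?"))
```

Applying the same `normalize` function to both documents and questions is one plausible reason normalization would improve retrieval, since spelling variants of the same word then produce identical index terms. Exact token matching is deliberately simplistic here; interrogatives carrying prefixes would need additional rules.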
