Template-based Question Answering analysis on the LC-QuAD2.0 Dataset

In recent years, template-based question answering has gained traction as a solution for evaluating RDF triples. Two important questions arise in this domain: the size of the dataset used as the knowledge base, and the training process applied to that knowledge base. Previous studies addressed this problem using the LC-QuAD dataset and a recursive neural network for training. This paper studies the same problem on LC-QuAD 2.0, a larger and newer benchmark dataset, and trains different machine learning models on it. The objective of this paper is to provide a comparative study using the LC-QuAD 2.0 dataset, which has an updated schema and 30,000 question-answer pairs. Our study compares two machine learning models and three pre-processing techniques to identify the best model for this problem.
