Open Domain Question Answering via Semantic Enrichment

Most recent question answering (QA) systems query large-scale knowledge bases (KBs) to answer a question, after parsing and transforming the natural language question into a KB-executable form (e.g., a logical form). It is well known that KBs are far from complete, so the information required to answer a question may not always exist in a KB. In this paper, we develop a new QA system that mines answers directly from the Web while employing KBs as a significant auxiliary resource to further boost QA performance. Specifically, to the best of our knowledge, we make the first attempt to link answer candidates to entities in Freebase during answer candidate generation. This linking brings several advantages: (1) Redundancy among answer candidates is automatically reduced. (2) The types of an answer candidate can be determined directly from those of its corresponding entity in Freebase. (3) The rich information about entities in Freebase lets us develop semantic features for each linked answer candidate. In particular, we construct answer-type related features with two novel probabilistic models, which directly evaluate the appropriateness of an answer candidate's types for a given question. Overall, these semantic features play a significant role in identifying the true answers within the large answer candidate pool. Experimental results show that, across two test datasets, our QA system achieves an 18% to 54% improvement in F_1 over various existing QA systems.
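
As a rough illustration of advantages (1) and (2), the sketch below shows how linking surface-form candidates to KB entities can merge redundant candidates and expose their types. It is a minimal sketch under stated assumptions, not the system described in this paper: the alias table, the entity IDs ("e1", "e2"), the type names, and the functions link_and_merge and candidate_types are all hypothetical, and a plain dictionary lookup stands in for a real entity linker over Freebase.

```python
from collections import defaultdict

# Hypothetical alias table standing in for an entity linker over a KB.
# Entity IDs and type names are invented for illustration only.
ALIAS_TO_ENTITY = {
    "barack obama": "e1",
    "obama": "e1",
    "president obama": "e1",
    "honolulu": "e2",
}
ENTITY_TYPES = {
    "e1": {"person", "politician"},
    "e2": {"location", "city"},
}


def link_and_merge(candidates):
    """Group (surface form, count) candidates by linked entity, summing counts."""
    merged = defaultdict(int)
    for surface, count in candidates:
        entity = ALIAS_TO_ENTITY.get(surface.lower())
        if entity is not None:  # unlinked mentions are simply dropped in this sketch
            merged[entity] += count
    return dict(merged)


def candidate_types(entity):
    """Return the types of a linked candidate, read off its KB entity."""
    return ENTITY_TYPES.get(entity, set())


# Candidates as they might be mined from Web snippets: (surface form, frequency).
candidates = [
    ("Obama", 5),
    ("Barack Obama", 3),
    ("Honolulu", 2),
    ("the White House", 1),
]
merged = link_and_merge(candidates)
```

Here the three mentions that refer to the same entity collapse into one candidate with a pooled count, and its types are available for answer-type features without any separate type classifier.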
