Semantic Models for Question Answering

The research presented in this paper focuses on the adoption of semantic models in Question Answering (QA) systems. We propose a framework that uses semantic technologies to analyze the question and to retrieve and rank relevant passages. The framework relies on: (a) Natural Language Processing algorithms to analyze questions and candidate answers in both English and Italian; (b) probabilistic Information Retrieval (IR) models to retrieve candidate answers; and (c) Machine Learning methods for question classification. Answers are drawn from a collection of unstructured text documents stored in search indices. The aim of the research is to improve system performance by introducing semantic models at every step of the answering process.
