Learning to Rank for Question Matching

This article presents an approach that lets a user query an FAQ-style knowledge base, that is, a set of questions and their respective answers written in natural language. The component presented here matches the user's question to one or more questions in the knowledge base. To do so, we rely on an existing question-analysis component that selects a set of candidate questions close to the user's question and produces features for each (user question, candidate question) pair. This component is chained to a model that ranks the candidate questions. The model is learned automatically in a supervised fashion: only part of the training corpus is annotated manually, and the rest is labeled using ad hoc rules. This work builds on results from a recent research area, learning to rank, and adapts them to an innovative industrial application: question matching as a paradigm for accessing knowledge. An experiment on data from a production system evaluates the quality of each learning phase.
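As an illustration of the ranking step only, the sketch below trains a linear scoring function with a pairwise logistic loss on toy (user question, candidate question) feature vectors, then sorts new candidates by score. It is a minimal sketch under stated assumptions, not the model used in this work: the abstract does not specify the learning algorithm or the features, so the pairwise loss, the two feature names (lexical overlap, same question type), the toy labels, and the function names `train_pairwise` and `rank` are all hypothetical.

```python
import math
import random


def train_pairwise(queries, epochs=200, lr=0.1, seed=0):
    """Learn a linear scorer from pairwise preferences.

    queries: one list per user question, each holding (features, label)
    tuples for its candidate questions; label 1 = correct match, 0 = not.
    Returns the weight vector w.
    """
    rng = random.Random(seed)
    dim = len(queries[0][0][0])
    w = [0.0] * dim

    # Build feature differences (better candidate minus worse candidate),
    # only between candidates of the same user question.
    pairs = []
    for cands in queries:
        for xi, yi in cands:
            for xj, yj in cands:
                if yi > yj:
                    pairs.append([a - b for a, b in zip(xi, xj)])

    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            # g = sigmoid(-margin), computed in a numerically stable way;
            # it scales the gradient step of the loss log(1 + exp(-margin)).
            if margin > 0:
                e = math.exp(-margin)
                g = e / (1.0 + e)
            else:
                g = 1.0 / (1.0 + math.exp(margin))
            w = [wk + lr * g * dk for wk, dk in zip(w, d)]
    return w


def rank(w, candidates):
    """Sort (name, features) candidates by decreasing linear score."""
    scored = [(sum(wk * xk for wk, xk in zip(w, x)), name)
              for name, x in candidates]
    return [name for _, name in sorted(scored, key=lambda t: -t[0])]


# Toy training data (hypothetical features: [lexical_overlap, same_type]).
training = [
    [([0.9, 1.0], 1), ([0.4, 0.0], 0), ([0.2, 1.0], 0)],
    [([0.8, 0.0], 1), ([0.3, 0.0], 0), ([0.1, 1.0], 0)],
]
w = train_pairwise(training)

# Rank unseen candidates for a new user question.
ranking = rank(w, [("a", [0.2, 1.0]), ("b", [0.9, 0.0]), ("c", [0.5, 0.0])])
print(ranking)
```

In a setting like the one described above, the labels would come partly from manual annotation and partly from ad hoc rules; the sketch treats both sources identically, which is a simplification.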
