This paper presents the participation of the FIDJI system in the Web Question-Answering evaluation campaign organized by Quaero in 2009. FIDJI is an open-domain question-answering system that combines syntactic information with traditional QA techniques, such as named entity recognition and term weighting, in order to validate answers through multiple documents. It was originally designed to process ``clean'' document collections. Overall results are significantly lower than in traditional campaigns, but the results for the French evaluation are quite good compared to other state-of-the-art systems. They show that a syntax-based strategy, applied to uncleaned Web data, can still obtain good results. Moreover, we obtain much higher scores on ``complex'' questions, i.e. `how' and `why' questions, which are more representative of real user needs. These results show that the Web can be queried with advanced linguistic techniques without heavy pre-processing, and with results that come close to those of the best systems, which rely on strong resources and large structured indexes.