We describe a parser for the robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases. Users can speak or type, and they can navigate and follow links using mouse clicks. Spoken or written queries may combine search expressions with browser commands and search space restrictions. In interpreting input queries, the system has to be fault-tolerant to account for spontaneous speech phenomena as well as typing and speech recognition errors, which often distort the meaning of an utterance and are difficult to detect and correct. Our parser integrates shallow parsing techniques with knowledge-based text retrieval to allow for robust processing and coordination of input modes. Parsing relies on a two-layered approach. In the first layer, typical meta-expressions, such as those concerning search, newspaper types and dates, are identified and excluded from the search string to be sent to the search engine. In the second layer, the search terms that remain after this preprocessing are grouped according to co-occurrence statistics derived from a newspaper corpus. These statistics concern typical noun phrases as they appear in newspaper texts.
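The two-layered approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular expressions in `META_PATTERNS`, the toy counts in `COOCCURRENCE`, the `threshold` value and all function names are hypothetical, and greedy pairing of adjacent terms stands in for whatever grouping procedure the parser actually uses.

```python
import re
from typing import Dict, List, Tuple

# Layer 1 (assumed patterns): meta-expressions for search commands,
# newspaper types and dates. A real system would need a far larger inventory.
META_PATTERNS = [
    ("command", re.compile(r"\b(?:search for|look for|find|show me)\b", re.I)),
    ("newspaper_type",
     re.compile(r"\bin (?:the )?(?:daily|weekly|regional) (?:news)?papers?\b", re.I)),
    ("date", re.compile(r"\b(?:from|since|in) (?:19|20)\d{2}\b", re.I)),
]

# Layer 2 (toy data): counts of adjacent word pairs, standing in for
# noun-phrase co-occurrence statistics derived from a newspaper corpus.
COOCCURRENCE: Dict[Tuple[str, str], int] = {
    ("prime", "minister"): 120,
    ("interest", "rates"): 85,
}


def strip_meta(utterance: str) -> Tuple[Dict[str, str], List[str]]:
    """Remove the first match of each meta-pattern; return restrictions and leftover terms."""
    restrictions: Dict[str, str] = {}
    remaining = utterance
    for label, pattern in META_PATTERNS:
        match = pattern.search(remaining)
        if match:
            restrictions[label] = match.group(0)
            # Cut the matched span out of the string sent on to layer 2.
            remaining = remaining[:match.start()] + " " + remaining[match.end():]
    return restrictions, remaining.split()


def group_terms(terms: List[str], threshold: int = 50) -> List[str]:
    """Greedily merge adjacent terms whose co-occurrence count reaches the threshold."""
    groups: List[str] = []
    i = 0
    while i < len(terms):
        pair = (terms[i].lower(), terms[i + 1].lower()) if i + 1 < len(terms) else None
        if pair is not None and COOCCURRENCE.get(pair, 0) >= threshold:
            groups.append(f"{terms[i]} {terms[i + 1]}")
            i += 2
        else:
            groups.append(terms[i])
            i += 1
    return groups


def parse(utterance: str) -> Tuple[Dict[str, str], List[str]]:
    """Run both layers: strip meta-expressions, then group the remaining search terms."""
    restrictions, terms = strip_meta(utterance)
    return restrictions, group_terms(terms)
```

For example, `parse("find prime minister interest rates in daily papers from 1999")` separates the command, newspaper-type and date expressions from the query and leaves the two phrases `prime minister` and `interest rates` as grouped search terms. Because unmatched words simply pass through as single terms, the sketch degrades gracefully on input it does not recognise, which is the kind of fault tolerance the shallow approach aims for.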