One of the problems facing the user of a bilingual dictionary is producing multiword expressions and phrases in the target language when no explicit phrasal translation appears in the dictionary. Defining collocations as the preferred choice of words for expressing a desired concept, the DECIDE project has been exploring how collocational information can be discovered in mono- and bilingual dictionaries and raw text corpora, extracted, and stored online. During the project, we developed tools for identifying potential collocations in raw text; for marking up English, French and German text for use in an interactive corpus query tool; for accessing lexical and grammatical patterns in such a corpus via this query tool; for accessing collocations derived from online bilingual dictionaries; and for documenting such collocations using available text corpora. Finally, we produced a common interface to these textual, corpus and dictionary tools, and used it to create a multilingual lexicon of the collocational choices of support verbs for nominalizations of speech act verbs. This paper presents an overview of this European Union-sponsored project: its objectives, its methodology, and its results.