We describe an application of sentence alignment techniques and approximate string matching to the problem of extracting lexicographically interesting word-word pairs from multilingual corpora. Since our interest is in support systems for lexicographers rather than in fully automatic construction of lexicons, we would like to provide access to parameters allowing a tunable trade-off between precision and recall. We evaluate two techniques for doing this. Since sentence alignment tends to associate semantically similar words, whereas approximate string matching draws attention to orthographic similarities, they can be used to serve different lexicographic purposes, as can the combination of the two techniques, which amounts, inter alia, to a tool for uncovering faux amis. We conclude by sketching a simple and flexible means for allowing lexicographers to provide information which has the potential to improve system performance.
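The tunable trade-off on the string-matching side can be illustrated with a minimal sketch. It assumes, purely for illustration, a similarity score based on the longest common subsequence (LCS) of two words; the abstract does not say which matching measure is actually used, and the names `lcs_length`, `similarity`, `candidate_pairs`, and `threshold` are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Orthographic similarity in [0, 1]: 2 * LCS / (len(a) + len(b)).

    This normalisation is an assumption made for the sketch.
    """
    if not a or not b:
        return 0.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def candidate_pairs(source_words, target_words, threshold):
    """Return word-word pairs whose similarity meets the threshold.

    Raising the threshold favours precision; lowering it favours recall.
    """
    return [
        (s, t)
        for s in source_words
        for t in target_words
        if similarity(s.lower(), t.lower()) >= threshold
    ]


english = ["library", "dog"]
french = ["librairie", "chien"]
strict = candidate_pairs(english, french, threshold=0.7)
loose = candidate_pairs(english, french, threshold=0.1)
```

With the strict threshold only the orthographically close pair ("library", "librairie") is proposed, which happens to be a faux ami ("librairie" means bookshop). Lowering the threshold admits weaker matches, which is the kind of precision/recall control a lexicographer could adjust.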