We examine the Differential Grammar, a representation designed to discriminate which of a set of confusable alternatives is most likely in the context in which it occurs. This approach is useful wherever uncertainty may exist about the identity of a token or sequence of tokens, including in speech recognition, optical character recognition and machine translation. In this paper our application is word processing: we discuss multiple models of confusion which may be used in the identification of confused words, we show how significant contexts may be identified and condensed into Differential Grammars, and we contrast the performance of our implementation with that of two commercial grammar checkers which purport to handle the confused word problem.