Paper Information - Handling Sparse Data by Successive Abstraction

Handling Sparse Data by Successive Abstraction

A general, practical method for handling sparse data that avoids held-out data and iterative reestimation is derived from first principles. It has been tested on a part-of-speech tagging task and out-performed (deleted) interpolation with context-independent weights, even when the latter used a globally optimal parameter setting determined a posteriori.

Christer Samuelsson

[1] L. Baum, et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process, 1972.

[2] Frederick Jelinek, et al. Interpolated estimation of Markov source parameters from sparse data, 1980.

[3] Slava M. Katz, et al. Estimation of probabilities from sparse data for the language model component of a speech recognizer, 1987, IEEE Trans. Acoust. Speech Signal Process.

[4] Steven J. DeRose, et al. Grammatical Category Disambiguation by Statistical Optimization, 1988, CL.

[5] Frederick B. Thompson, et al. English for the computer, 1966, AFIPS '66 (Fall).

[6] I. Good. The population frequencies of species and the estimation of population parameters, 1953.