Language Acquisition and Data Compression
Statistical data compression requires a stochastic language model that adapts rapidly to new data as it is encountered. We introduce a grammatical inference engine that satisfies this requirement: it discovers structure in arbitrary data using nothing more than the predictions of a simple trigram model. We show that compression may be used as an alternative to perplexity for language model evaluation, and we suggest that the information processing techniques employed by our system may reflect what happens in the human brain.
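The link between compression and perplexity can be illustrated with a minimal sketch. It is not the inference engine described above; it only assumes a byte-level trigram model that adapts as it reads, with add-one smoothing over a 256-symbol alphabet. The function names, the smoothing scheme, and the zero-padded starting context are all illustrative assumptions. The sketch sums the ideal code length, -log2 p, of each symbol, which is the size an arithmetic coder would approach, and converts bits per symbol into perplexity.

```python
import math
from collections import defaultdict


def adaptive_trigram_code_length(data: bytes, alphabet_size: int = 256) -> float:
    """Ideal code length of `data`, in bits, under an adaptive trigram model.

    Assumptions (not from the paper): byte-level symbols, add-one smoothing,
    and a starting context padded with two zero bytes.
    """
    context_counts = defaultdict(int)  # (a, b) -> times this context was seen
    trigram_counts = defaultdict(int)  # (a, b, c) -> times c followed (a, b)
    total_bits = 0.0
    a, b = 0, 0  # padded starting context
    for c in data:
        # Predict the symbol BEFORE updating the counts, as a decoder would.
        p = (trigram_counts[(a, b, c)] + 1) / (context_counts[(a, b)] + alphabet_size)
        total_bits += -math.log2(p)
        # Adapt the model to the symbol just seen.
        trigram_counts[(a, b, c)] += 1
        context_counts[(a, b)] += 1
        a, b = b, c
    return total_bits


def perplexity(total_bits: float, n_symbols: int) -> float:
    """Per-symbol perplexity corresponding to a total code length in bits."""
    return 2.0 ** (total_bits / n_symbols)


if __name__ == "__main__":
    sample = b"the cat sat on the mat. " * 50
    bits = adaptive_trigram_code_length(sample)
    print(f"bits per symbol: {bits / len(sample):.3f}")
    print(f"perplexity:      {perplexity(bits, len(sample)):.3f}")
```

Because perplexity is 2 raised to the average number of bits per symbol, ranking models by compressed size and ranking them by perplexity on the same data give the same order. In this sketch the compression figure also includes the cost of learning the data from scratch, since the model starts with empty counts.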