A Domain-Independent, Dictionary-Free Lexical Acquisition Model for Chinese Documents

This paper presents a domain-independent, dictionary-free lexical acquisition model for Chinese documents. The model uses a self-increasing algorithm to acquire co-occurrence patterns of Chinese characters, then filters these patterns with criteria such as support and confidence to obtain lexical items. Experiments show that it acquires high-frequency lexical items effectively and efficiently, without a dictionary and without supervised learning on a corpus. The model is particularly suited to Chinese information processing applications that depend on lexical frequency but are time-critical, such as real-time automatic Chinese text classification systems.
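
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how self-increasing pattern growth with support and confidence filtering might look. Everything in it is an assumption made for illustration: the function name `acquire_lexical_items`, the thresholds `min_support`, `min_confidence`, and `max_len`, the definition of support as a raw occurrence count, and the definition of confidence as count(pattern) / count(pattern without its last character). The paper's actual definitions may differ.

```python
from collections import Counter


def acquire_lexical_items(text, min_support=2, min_confidence=0.6, max_len=4):
    """Grow character patterns one character at a time (illustrative sketch).

    Assumed definitions, not taken from the paper:
      support    = number of occurrences of the pattern in `text`
      confidence = count(pattern) / count(pattern[:-1])
    Returns {pattern: (support, confidence)} for patterns of length >= 2.
    """
    # Level 1: single characters that meet the support threshold.
    counts = dict(Counter(text))
    frequent = {ch for ch, c in counts.items() if c >= min_support}

    items = {}
    length = 1
    while frequent and length < max_len:
        length += 1
        # Extend only patterns whose prefix was frequent at the previous level.
        candidates = Counter(
            text[i:i + length]
            for i in range(len(text) - length + 1)
            if text[i:i + length - 1] in frequent
        )
        frequent = set()
        for pattern, count in candidates.items():
            if count < min_support:
                continue
            frequent.add(pattern)
            counts[pattern] = count
            confidence = count / counts[pattern[:-1]]
            if confidence >= min_confidence:
                items[pattern] = (count, confidence)
    return items


# Toy input without punctuation; a real system would first split on it.
sample = "信息处理信息处理信息分类"
items = acquire_lexical_items(sample)
# "信息" occurs 3 times and "信" is always followed by "息",
# so it is kept with support 3 and confidence 1.0.
print(items["信息"])
```

On such a short input the sketch also keeps cross-boundary fragments such as "理信", which suggests why additional filtering criteria and a larger corpus would be needed in practice. No dictionary or labeled corpus is consulted at any step, which matches the dictionary-free, unsupervised setting the abstract describes.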