Word Extraction from Corpora and Its Part-of-Speech Estimation Using Distributional Analysis
Unknown words are inevitable at every step of analysis in natural language processing. We propose a method that extracts words from a corpus and, by distributional analysis, estimates the probability that each extracted word belongs to each of a given set of parts of speech (POSs). Our experiments show that this method is effective for inferring the POS of unknown words.
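As a rough illustration of the general idea of distributional POS estimation, the following minimal sketch compares the contexts in which an unknown word occurs with the contexts typical of each known POS. It is not the estimator used in this paper: the feature choice (POS of the left and right neighbor), the cosine similarity, the normalization of similarities into probabilities, the toy corpus, and all function names are assumptions made for illustration only.

```python
from collections import Counter
import math

def context_profile(tokens, is_target):
    """Count (side, neighbor POS) features around every position selected by is_target."""
    profile = Counter()
    for i in range(len(tokens)):
        if not is_target(i):
            continue
        left = tokens[i - 1][1] if i > 0 else "<s>"
        right = tokens[i + 1][1] if i + 1 < len(tokens) else "</s>"
        profile[("L", left)] += 1
        profile[("R", right)] += 1
    return profile

def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def pos_probabilities(tokens, word, pos_tags):
    """Score each POS by context similarity and normalize the scores to sum to 1.

    Assumption: normalized cosine similarity stands in for a probability;
    the paper's actual estimator is not specified here.
    """
    word_profile = context_profile(tokens, lambda i: tokens[i][0] == word)
    scores = {}
    for pos in pos_tags:
        pos_profile = context_profile(
            tokens,
            lambda i, pos=pos: tokens[i][1] == pos and tokens[i][0] != word,
        )
        scores[pos] = cosine(word_profile, pos_profile)
    total = sum(scores.values())
    if total == 0:
        # No distributional evidence: fall back to a uniform distribution.
        return {pos: 1.0 / len(pos_tags) for pos in pos_tags}
    return {pos: s / total for pos, s in scores.items()}

# Hypothetical toy corpus of (word, POS) pairs; "blick" is the unknown word.
TOY_CORPUS = [
    ("the", "DET"), ("dog", "NOUN"), ("runs", "VERB"),
    ("the", "DET"), ("blick", "UNK"), ("sleeps", "VERB"),
    ("a", "DET"), ("cat", "NOUN"), ("eats", "VERB"),
    ("a", "DET"), ("blick", "UNK"), ("runs", "VERB"),
]

probs = pos_probabilities(TOY_CORPUS, "blick", ["NOUN", "VERB", "DET"])
print(probs)
```

In this toy corpus the unknown word appears only between a determiner and a verb, the same environment as the known nouns, so NOUN receives the highest score. A realistic setting would need far more data and smoothing for sparse contexts.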