Paper Information - Word Extraction from Corpora and Its Part-of-Speech Estimation Using Distributional Analysis

Word Extraction from Corpora and Its Part-of-Speech Estimation Using Distributional Analysis

Unknown words are inevitable at any step of analysis in natural language processing. We propose a method to extract words from a corpus and estimate the probability that each word belongs to given parts of speech (POSs), using a distributional analysis. Our experiments have shown that this method is effective for inferring the POS of unknown words.
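
The abstract does not spell out the procedure, so the following is only a rough sketch of what a distributional analysis for POS estimation could look like, not the authors' method. It assumes an already tokenized corpus and a small lexicon of known words per POS. It compares the left and right neighbour distribution of an unknown word with the pooled neighbour distribution of each POS, then normalizes the similarities into scores. The function names (`context_distribution`, `cosine`, `pos_scores`), the cosine measure, and the toy data are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def context_distribution(tokens, targets):
    """Relative frequencies of left/right neighbours of any token in `targets`."""
    counts = Counter()
    padded = ["<s>"] + list(tokens) + ["</s>"]  # sentinel boundary tokens
    for i in range(1, len(padded) - 1):
        if padded[i] in targets:
            counts[("L", padded[i - 1])] += 1
            counts[("R", padded[i + 1])] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse distributions stored as dicts."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def pos_scores(tokens, lexicon, unknown):
    """Score each POS for `unknown`; `lexicon` maps POS -> set of known words.

    Assumption: similarities are simply normalized to sum to 1, which is a
    stand-in for a proper probability estimate.
    """
    target = context_distribution(tokens, {unknown})
    sims = {
        pos: cosine(target, context_distribution(tokens, words))
        for pos, words in lexicon.items()
    }
    total = sum(sims.values())
    return {pos: (s / total if total else 0.0) for pos, s in sims.items()}


# Toy corpus: "blick" is the unknown word and appears in a noun-like slot.
tokens = "the cat sleeps . the dog sleeps . the blick runs . a dog runs .".split()
lexicon = {"noun": {"cat", "dog"}, "verb": {"sleeps", "runs"}}
scores = pos_scores(tokens, lexicon, "blick")
print(scores)
```

In this toy corpus "blick" shares its neighbours with the known nouns and none with the known verbs, so the noun score dominates. A realistic setup would need far more data and a smoothed comparison of distributions.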

Makoto Nagao | Shinsuke Mori
