A Hybrid Approach to Word Segmentation and POS Tagging
In this paper, we present a hybrid method for word segmentation and part-of-speech (POS) tagging. The target languages are those in which word boundaries are ambiguous, such as Chinese and Japanese. The method combines word-based and character-based processing, and it performs word segmentation and POS tagging simultaneously. Experimental results on multiple corpora show that the integrated method achieves high accuracy.
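To make the idea of combining word-level and character-level candidates in a single joint search more concrete, the following is a minimal sketch and not the paper's actual model. It assumes a toy lexicon with hand-picked costs, a flat-cost single-character fallback tagged `UNK` for text the lexicon does not cover, and a small table of tag-bigram costs; all names and numbers (`LEXICON`, `TRANSITION`, `UNKNOWN_CHAR_COST`, `segment_and_tag`) are hypothetical. A Viterbi search over the resulting lattice picks the segmentation and the POS tags at the same time.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy lexicon: word -> {POS tag: cost}. Lower cost is better.
LEXICON: Dict[str, Dict[str, float]] = {
    "研究": {"NN": 1.0, "VV": 1.5},
    "研究生": {"NN": 1.2},
    "生命": {"NN": 1.0},
    "命": {"NN": 2.0},
    "起源": {"NN": 1.0},
}
UNKNOWN_TAG = "UNK"
UNKNOWN_CHAR_COST = 5.0   # assumed flat cost for the character-level fallback
TRANSITION = {("NN", "NN"): 0.2, ("VV", "NN"): 0.1}  # assumed tag-bigram costs
DEFAULT_TRANSITION = 1.0  # assumed cost for any tag pair not listed above
MAX_WORD_LEN = 4


def candidates(text: str, start: int) -> Iterator[Tuple[int, str, str, float]]:
    """Yield lattice edges (end, word, tag, cost) that begin at `start`."""
    # Word-based candidates: every lexicon entry matching at this position.
    for end in range(start + 1, min(len(text), start + MAX_WORD_LEN) + 1):
        word = text[start:end]
        for tag, cost in LEXICON.get(word, {}).items():
            yield end, word, tag, cost
    # Character-based fallback: a single character is always a candidate,
    # so the lattice stays connected even for out-of-vocabulary text.
    yield start + 1, text[start], UNKNOWN_TAG, UNKNOWN_CHAR_COST


def segment_and_tag(text: str) -> List[Tuple[str, str]]:
    """Jointly choose word boundaries and POS tags with a Viterbi search."""
    if not text:
        return []
    # best[pos][tag] = (total cost, (previous pos, previous tag), word)
    best: List[Dict[str, tuple]] = [dict() for _ in range(len(text) + 1)]
    best[0]["<s>"] = (0.0, None, None)
    for pos in range(len(text)):
        for prev_tag, (cost_so_far, _, _) in best[pos].items():
            for end, word, tag, cost in candidates(text, pos):
                step = TRANSITION.get((prev_tag, tag), DEFAULT_TRANSITION)
                total = cost_so_far + cost + step
                if tag not in best[end] or total < best[end][tag][0]:
                    best[end][tag] = (total, (pos, prev_tag), word)
    # Backtrace from the cheapest tag at the end of the sentence.
    pos = len(text)
    tag = min(best[pos], key=lambda t: best[pos][t][0])
    result: List[Tuple[str, str]] = []
    while pos > 0:
        _, (prev_pos, prev_tag), word = best[pos][tag]
        result.append((word, tag))
        pos, tag = prev_pos, prev_tag
    result.reverse()
    return result


print(segment_and_tag("研究生命起源"))
```

Because boundaries and tags are scored on the same lattice, a tag preference can change the chosen segmentation and vice versa, which is the practical difference from segmenting first and tagging afterwards. A real system would learn the costs from an annotated corpus and would typically model unknown words with character-level position tags instead of the flat fallback used here.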