Paper Information - Using Co-occurrence Statistics as an Information Source for Partial Parsing of Chinese


Our partial parser for Chinese uses a learned classifier to guide a bottom-up parsing process. We describe improvements in performance obtained by expanding the information available to the classifier from POS sequences alone to also include measures of word association derived from co-occurrence statistics. We compare performance using different measures of association and find that Yule's coefficient of colligation Y gives somewhat better results than the other measures.
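For readers unfamiliar with the measure, the following is a minimal sketch of how Yule's coefficient of colligation Y can be computed from a 2x2 contingency table of co-occurrence counts, using the standard definition Y = (√(ad) − √(bc)) / (√(ad) + √(bc)). The abstract does not say how the counts are collected, so the adjacent-word-pair setup, the function names `contingency` and `yule_y`, and the zero-denominator convention are all assumptions made for illustration, not the authors' implementation.

```python
import math


def contingency(pair_count, w1_count, w2_count, total_pairs):
    """Build a 2x2 table for a word pair (w1, w2) from raw counts.

    Assumption: counts come from adjacent word pairs in a corpus.
    a = pairs where w1 is first and w2 is second
    b = pairs where w1 is first but the second word is not w2
    c = pairs where w2 is second but the first word is not w1
    d = all remaining pairs
    """
    a = pair_count
    b = w1_count - pair_count
    c = w2_count - pair_count
    d = total_pairs - a - b - c
    return a, b, c, d


def yule_y(a, b, c, d):
    """Yule's coefficient of colligation Y, ranging from -1 to 1."""
    ad = math.sqrt(a * d)
    bc = math.sqrt(b * c)
    if ad + bc == 0:
        # Y is undefined here; returning 0.0 is an illustrative convention.
        return 0.0
    return (ad - bc) / (ad + bc)


# Hypothetical counts: the pair occurs 9 times, each word 10 times,
# out of 20 observed pairs in total.
table = contingency(pair_count=9, w1_count=10, w2_count=10, total_pairs=20)
score = yule_y(*table)
```

Values near 1 indicate that the two words occur together far more often than apart, values near 0 indicate no association, and negative values indicate that the words tend to avoid each other. A score of this kind could then be passed to the classifier as a feature alongside the POS sequence.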

Qiang Zhou | Elliott Franco Drabek
