Construction and Application of the Knowledge Base of Chinese Multi-word Expressions
Multi-word Expressions (MWEs, also called “idiomatic expressions” or “set phrases”) are very common in everyday language use. Most linguists hold that MWE is an inclusive concept covering not only lexical units such as idioms, idiomatic expressions, xiehouyu, and proper nouns, but also non-lexical units such as proverbs, maxims, and adages. Even expressions that are only statistically idiosyncratic are counted as MWEs. In NLP tasks such as word segmentation and semantic role labeling, MWEs remain a bottleneck. Constructing a knowledge base of MWEs with relatively complete entries and tagged attributes would therefore be an effective way to address this problem. This paper describes the construction and application of an MWE knowledge base at the Institute of Computational Linguistics at Peking University (ICL/PKU), which the author hopes will support further research in this area.