Neural recovery machine for Chinese dropped pronoun

Dropped pronouns (DPs) are ubiquitous in pro-drop languages such as Chinese and Japanese. Previous work has mainly focused on painstakingly engineering empirical features for DP recovery. In this paper, we propose a neural recovery machine (NRM) that models and recovers DPs in Chinese without this non-trivial feature engineering process. Experimental results show that the proposed NRM significantly outperforms the state-of-the-art approaches on two heterogeneous datasets. Further experiments on Chinese zero pronoun (ZP) resolution show that ZP resolution performance can also be improved by first recovering the ZPs as DPs.
