ChID: A Large-scale Chinese IDiom Dataset for Cloze Test

Cloze-style reading comprehension in Chinese remains limited by the lack of diverse corpora. In this paper we propose ChID, a large-scale Chinese cloze test dataset that studies the comprehension of idioms, a unique language phenomenon in Chinese. In this corpus, the idioms in a passage are replaced by blank symbols, and the correct answer must be chosen from well-designed candidate idioms. We carefully study how the design of candidate idioms and the representation of idioms affect the performance of state-of-the-art models. Results show that machine accuracy is substantially worse than human accuracy, indicating ample room for further research.
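The task format described above can be illustrated with a minimal sketch. The blank marker `[BLANK]`, the field names, and the helper functions `make_cloze` and `accuracy` are hypothetical choices made for this illustration; they are not taken from the dataset release, and the toy passage is not a ChID sample.

```python
# Hypothetical blank marker; the actual symbol used in the corpus may differ.
BLANK = "[BLANK]"


def make_cloze(passage, idiom, distractors):
    """Replace the first occurrence of `idiom` with a blank and build candidates."""
    if idiom not in passage:
        raise ValueError("idiom does not occur in passage")
    # Sort so the position of the gold idiom does not depend on input order.
    candidates = sorted(set(distractors) | {idiom})
    return {
        "text": passage.replace(idiom, BLANK, 1),
        "candidates": candidates,
        "answer": candidates.index(idiom),  # index of the gold idiom
    }


def accuracy(predictions, instances):
    """Fraction of instances whose predicted candidate index is the gold index."""
    correct = sum(p == inst["answer"] for p, inst in zip(predictions, instances))
    return correct / len(instances)


# Toy instance: one passage, one gold idiom, two distractor idioms.
example = make_cloze(
    "他做事总是画蛇添足。",
    "画蛇添足",
    ["一帆风顺", "守株待兔"],
)
```

A model under this setup would score each candidate against the blanked passage and return an index, which `accuracy` compares with the stored gold index.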
