Where’s My Head? Definition, Data Set, and Models for Numeric Fused-Head Identification and Resolution

We provide the first computational treatment of fused-head constructions (FHs), focusing on numeric fused-heads (NFHs). FH constructions are noun phrases in which the head noun is missing and is said to be “fused” with its dependent modifier. The missing information is implicit and is important for sentence understanding. Humans fill in the missing references easily, but they pose a challenge for computational models. We formulate the handling of FHs as a two-stage process: identification of the FH construction and resolution of the missing head. We explore the NFH phenomenon in large corpora of English text and create (1) a data set and a highly accurate method for NFH identification; (2) a crowd-sourced data set of 10k examples (1M tokens) for NFH resolution; and (3) a neural baseline for the NFH resolution task. We release our code and data set to foster further research into this challenging problem.
