Multilingual Zero-shot Constituency Parsing

Zero-shot constituency parsing aims to extract parse trees from neural models such as pre-trained language models (PLMs) without further training the models or training a separate parser. This paper improves upon existing zero-shot parsing paradigms by introducing a novel chart-based parsing method that yields gains in zero-shot parsing performance. Furthermore, we broaden the range of zero-shot parsing applications by examining languages other than English and by utilizing multilingual models, demonstrating that our method can produce parse-tree-like structures for sentences in eight languages other than English.
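
As a rough illustration of the chart-based idea, the sketch below runs a CKY-style dynamic program over per-span scores to recover the highest-scoring binary bracketing. It is a minimal sketch under stated assumptions, not the paper's actual method: the span_scores input and the best_tree helper are hypothetical placeholders, and in practice the scores would be derived from a PLM's representations (e.g., from span-boundary hidden states) rather than hand-crafted.

    # Minimal CKY-style chart sketch: given a score for every span (i, j),
    # find the binary bracketing whose spans maximize the total score.
    # The span scores here are toy values; in a zero-shot setting they would
    # be computed from a pre-trained language model (hypothetical assumption).

    from typing import Dict, List, Tuple

    Span = Tuple[int, int]  # half-open token span (start, end)


    def best_tree(span_scores: Dict[Span, float], n: int) -> Tuple[float, List[Span]]:
        """Return the highest-scoring binary bracketing over n tokens."""
        chart: Dict[Span, Tuple[float, List[Span]]] = {}
        # Length-1 spans are leaves; they involve no bracketing decision.
        for i in range(n):
            chart[(i, i + 1)] = (span_scores.get((i, i + 1), 0.0), [(i, i + 1)])
        for length in range(2, n + 1):
            for i in range(0, n - length + 1):
                j = i + length
                best = None
                for k in range(i + 1, j):  # try every split point
                    left_score, left_spans = chart[(i, k)]
                    right_score, right_spans = chart[(k, j)]
                    total = span_scores.get((i, j), 0.0) + left_score + right_score
                    if best is None or total > best[0]:
                        best = (total, [(i, j)] + left_spans + right_spans)
                chart[(i, j)] = best
        return chart[(0, n)]


    if __name__ == "__main__":
        # Toy example over a 4-token sentence with hand-crafted span scores.
        scores = {(0, 2): 1.0, (2, 4): 0.8, (0, 4): 0.5, (1, 4): -0.3}
        score, spans = best_tree(scores, 4)
        print(score, sorted(spans))

The dynamic program itself is language-agnostic, which is one reason a chart-based formulation pairs naturally with multilingual PLMs: only the span-scoring function depends on the underlying model.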
