Adversarial Neural Networks for Cross-lingual Sequence Tagging

We study cross-lingual sequence tagging with little or no labeled data in the target language. Adversarial training has previously been shown to be effective for training cross-lingual sentence classifiers. However, it is unclear whether the language-agnostic representations enforced by an adversarial language discriminator also enable effective transfer for token-level prediction tasks. We therefore experiment with different types of adversarial training on two tasks: dependency parsing and sentence compression. We show that adversarial training consistently improves cross-lingual performance on both tasks over a conventionally trained baseline.
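The core mechanism behind an adversarial language discriminator can be illustrated with a toy numpy sketch of gradient reversal: the discriminator is trained to tell the two languages apart from the shared representations, while the shared encoder receives that same gradient with its sign flipped, pushing the representations toward language invariance. This is only a minimal illustration under simplifying assumptions (a linear encoder, logistic-regression discriminator, and synthetic data); it is not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two "languages" whose feature distributions differ by a shift.
X_src = rng.normal(0.0, 1.0, size=(200, 4))
X_tgt = rng.normal(1.5, 1.0, size=(200, 4))
X = np.vstack([X_src, X_tgt])
lang = np.concatenate([np.zeros(200), np.ones(200)])  # language labels

W_enc = rng.normal(scale=0.1, size=(4, 4))  # shared encoder (linear, for simplicity)
w_disc = np.zeros(4)                        # language discriminator (logistic regression)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

lr, lam = 0.05, 1.0  # learning rate and adversarial weight
for step in range(200):
    H = X @ W_enc                       # shared representations
    p = sigmoid(H @ w_disc)             # discriminator's language predictions
    g_logits = (p - lang) / len(lang)   # d(binary cross-entropy)/d(logits)
    # The discriminator descends its loss, learning to separate the languages...
    w_disc -= lr * (H.T @ g_logits)
    # ...while the encoder receives the REVERSED gradient (a gradient reversal
    # layer), i.e. it ASCENDS the discriminator loss, making the shared
    # representations harder to classify by language.
    W_enc += lr * lam * (X.T @ np.outer(g_logits, w_disc))
```

In the full model, a task-specific tagging loss would be minimized jointly with this adversarial term, so the encoder learns features that are useful for the task yet uninformative about the input language.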
