Deep Pivot-Based Modeling for Cross-language Cross-domain Transfer with Minimal Guidance

While cross-domain and cross-language transfer have long been prominent topics in NLP research, their combination has hardly been explored. In this work we consider this problem and propose a framework that builds on pivot-based learning, structure-aware Deep Neural Networks (particularly LSTMs and CNNs), and bilingual word embeddings, with the goal of training a model on labeled data from one (language, domain) pair so that it can be effectively applied to another (language, domain) pair. We consider two setups, differing with respect to the unlabeled data available for model training. In the full setup the model has access to unlabeled data from both pairs, while in the lazy setup, which is more realistic for truly resource-poor languages, unlabeled data is available for both domains but only for the source language. We design our model for the lazy setup so that, for a given target domain, it can be trained once on the source language and then applied to any target language without re-training. In experiments with nine English-German and nine English-French domain pairs, our best model substantially outperforms previous models even when it is trained in the lazy setup and previous models are trained in the full setup.
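To make the pivot idea behind this line of work concrete, the toy sketch below selects "pivot" features: words that occur frequently in the unlabeled data of both domains and are predictive of the task label in the source domain. This is only an illustrative simplification of pivot selection, not the paper's actual model; the function name, scoring rule, and thresholds are hypothetical choices for the example.

```python
from collections import Counter

def select_pivots(src_docs, tgt_docs, labeled, min_count=2, k=3):
    """Toy pivot selection (illustrative, not the paper's method):
    keep features frequent in BOTH domains' unlabeled data, then rank
    them by a crude association score with the source-domain label."""
    # Document frequency in each domain (count a word once per document).
    src_counts = Counter(w for d in src_docs for w in set(d))
    tgt_counts = Counter(w for d in tgt_docs for w in set(d))
    # Candidate pivots must be frequent in both domains.
    shared = {w for w in src_counts
              if src_counts[w] >= min_count and tgt_counts.get(w, 0) >= min_count}

    def label_score(w):
        # Simple proxy for label association: |#positive docs - #negative docs|
        # containing the word, computed on the labeled source data.
        pos = sum(1 for d, y in labeled if y == 1 and w in d)
        neg = sum(1 for d, y in labeled if y == 0 and w in d)
        return abs(pos - neg)

    return sorted(shared, key=label_score, reverse=True)[:k]

# Tiny example: "great" appears often in both domains and predicts the label,
# so it is selected as a pivot; domain-specific words ("movie", "battery") are not.
src_docs = [["great", "movie"], ["great", "plot"], ["boring", "movie"]]
tgt_docs = [["great", "battery"], ["great", "screen"], ["boring", "battery"]]
labeled = [(["great", "movie"], 1), (["great", "plot"], 1), (["boring", "movie"], 0)]
pivots = select_pivots(src_docs, tgt_docs, labeled)
```

In the full framework, predicting the presence of such pivots from their context serves as an auxiliary task that ties the two domains together, and bilingual word embeddings extend the same idea across languages.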
