Adversarial Adaptation of Synthetic or Stale Data

Two types of data shift are common in practice: (1) transferring from synthetic data to live user data (a deployment shift), and (2) transferring from stale data to current data (a temporal shift). Both cause a distribution mismatch between training and evaluation, so a model that overfits the flawed training data performs poorly on the test data. We address this mismatch by framing it as domain adaptation, treating the flawed training dataset as a source domain and the evaluation dataset as a target domain. To this end, we use and build on several recent advances in neural domain adaptation, such as adversarial training (Ganin et al., 2016) and domain separation networks (Bousmalis et al., 2016), and propose a new, effective adversarial training scheme. In both supervised and unsupervised adaptation scenarios, our approach yields clear improvements over strong baselines.
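To make the adversarial ingredient concrete, the sketch below shows the standard gradient-reversal formulation of domain-adversarial training (Ganin et al., 2016): a shared encoder feeds both a task classifier and a domain discriminator, and the reversed gradient from the discriminator pushes the encoder toward domain-invariant features. This is only a minimal PyTorch illustration of that general technique, not the paper's actual slot-tagging architecture or its proposed training scheme; names such as DomainAdversarialNet, feat_dim, and lambd are illustrative choices.

```python
import torch
import torch.nn.functional as F
from torch import nn
from torch.autograd import Function


class GradReverse(Function):
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing back into the shared encoder.
        return -ctx.lambd * grad_output, None


class DomainAdversarialNet(nn.Module):
    """Toy shared encoder + task classifier + domain discriminator (illustrative only)."""

    def __init__(self, input_dim=300, feat_dim=128, num_labels=10, num_domains=2, lambd=1.0):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, feat_dim), nn.ReLU())
        self.task_head = nn.Linear(feat_dim, num_labels)     # task labels (e.g. intents)
        self.domain_head = nn.Linear(feat_dim, num_domains)  # source (synthetic/stale) vs. target
        self.lambd = lambd

    def forward(self, x):
        h = self.encoder(x)
        task_logits = self.task_head(h)
        # The discriminator learns to tell domains apart; the reversed gradient
        # trains the encoder to make that job hard, i.e. to align the domains.
        domain_logits = self.domain_head(GradReverse.apply(h, self.lambd))
        return task_logits, domain_logits


if __name__ == "__main__":
    model = DomainAdversarialNet()
    source_x = torch.randn(8, 300)   # labeled source batch (synthetic or stale data)
    target_x = torch.randn(8, 300)   # unlabeled target batch (live or current data)
    y = torch.randint(0, 10, (8,))   # task labels for the source batch only
    task_logits, src_dom_logits = model(source_x)
    _, tgt_dom_logits = model(target_x)
    loss = (F.cross_entropy(task_logits, y)
            + F.cross_entropy(src_dom_logits, torch.zeros(8, dtype=torch.long))
            + F.cross_entropy(tgt_dom_logits, torch.ones(8, dtype=torch.long)))
    loss.backward()  # encoder receives reversed gradients from the domain loss
```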

[1] Guillaume Lample et al. Neural Architectures for Named Entity Recognition, 2016, NAACL.

[2] Xiao Li et al. Extracting structured information from user queries with semi-supervised conditional random fields, 2009, SIGIR.

[3] Gökhan Tür et al. Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM, 2016, INTERSPEECH.

[4] Dilek Z. Hakkani-Tür et al. Zero-shot learning of intent embeddings for expansion by convolutional deep structured semantic models, 2016, IEEE ICASSP.

[5] Young-Bum Kim et al. Pre-training of Hidden-Unit CRFs, 2015, ACL.

[6] Young-Bum Kim et al. Weakly Supervised Slot Tagging with Partially Labeled Sequences from Web Search Click Logs, 2015, NAACL.

[7] Larry P. Heck et al. Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding, 2016, INTERSPEECH.

[8] Ruhi Sarikaya et al. Convolutional neural network based triangular CRF for joint intent detection and slot filling, 2013, IEEE Workshop on Automatic Speech Recognition and Understanding.

[9] Young-Bum Kim et al. Scalable Semi-Supervised Query Classification Using Matrix Sketching, 2016, ACL.

[10] Kuldip K. Paliwal et al. Bidirectional recurrent neural networks, 1997, IEEE Transactions on Signal Processing.

[11] François Laviolette et al. Domain-Adversarial Training of Neural Networks, 2015, Journal of Machine Learning Research.

[12] Ruhi Sarikaya. The technology powering personal digital assistants, 2015, INTERSPEECH.

[13] Kevin Duh et al. DyNet: The Dynamic Neural Network Toolkit, 2017, arXiv.

[14] Gökhan Tür et al. Extending domain coverage of language understanding systems via intent transfer between domains using knowledge graphs and search query click logs, 2014, IEEE ICASSP.

[15] Koby Crammer et al. Analysis of Representations for Domain Adaptation, 2006, NIPS.

[16] Jürgen Schmidhuber et al. Long Short-Term Memory, 1997, Neural Computation.

[17] Young-Bum Kim et al. Domain Attention with an Ensemble of Experts, 2017, ACL.

[18] Young-Bum Kim et al. Natural Language Model Re-usability for Scaling to Different Domains, 2016, EMNLP.

[19] Bing Liu et al. Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling, 2016, INTERSPEECH.

[20] Gary Geunbae Lee et al. Triangular-Chain Conditional Random Fields, 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[21] George Trigeorgis et al. Domain Separation Networks, 2016, NIPS.

[22] Bing Liu et al. Joint Online Spoken Language Understanding and Language Modeling With Recurrent Neural Networks, 2016, SIGDIAL.

[23] Geoffrey Zweig et al. Joint semantic utterance classification and slot filling with recursive neural networks, 2014, IEEE SLT.

[24] D. Signorini et al. Neural networks, 1995, The Lancet.

[25] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[26] Nitish Srivastava et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, Journal of Machine Learning Research.

[27] Hal Daumé et al. Frustratingly Easy Domain Adaptation, 2007, ACL.

[28] Fabrice Lefèvre et al. Zero-shot semantic parser for spoken language understanding, 2015, INTERSPEECH.

[29] Young-Bum Kim et al. New Transfer Learning Techniques for Disparate Label Sets, 2015, ACL.

[30] Young-Bum Kim et al. Compact Lexicon Selection with Spectral Methods, 2015, ACL.

[31] Ruhi Sarikaya et al. Shrinkage based features for slot tagging with conditional random fields, 2014, INTERSPEECH.

[32] Ruhi Sarikaya et al. An Empirical Investigation of Word Class-Based Features for Natural Language Understanding, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[33] Yoshua Bengio et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[34] Houfeng Wang et al. A Joint Model of Intent Determination and Slot Filling for Spoken Language Understanding, 2016, IJCAI.

[35] Young-Bum Kim et al. Frustratingly Easy Neural Domain Adaptation, 2016, COLING.

[36] Young-Bum Kim et al. Domainless Adaptation by Constrained Decoding on a Schema Lattice, 2016, COLING.

[37] Young-Bum Kim et al. Task specific continuous word representations for mono and multi-lingual spoken language understanding, 2014, IEEE ICASSP.

[38] Ruhi Sarikaya et al. A discriminative model based entity dictionary weighting approach for spoken language understanding, 2014, IEEE SLT.

[39] Gökhan Tür. Multitask Learning for Spoken Language Understanding, 2006, ICASSP.

[40] Regina Barzilay et al. Aspect-augmented Adversarial Networks for Domain Adaptation, 2017, TACL.

[41] Asli Celikyilmaz et al. Convolutional Neural Network Based Semantic Tagging with Entity Embeddings, 2015.

[42] Young-Bum Kim et al. An overview of end-to-end language understanding and dialog management for personal digital assistants, 2016, IEEE SLT.