PDALN: Progressive Domain Adaptation over a Pre-trained Model for Low-Resource Cross-Domain Named Entity Recognition

Cross-domain Named Entity Recognition (NER) transfers NER knowledge from high-resource source domains to a low-resource target domain. Limited labeled data and domain shift make cross-domain NER a challenging task. To address these challenges, we propose a progressive domain-adaptation Knowledge Distillation (KD) approach, PDALN. It achieves strong domain adaptability through three components: (1) adaptive data augmentation, which alleviates the cross-domain gap and label sparsity simultaneously; (2) multi-level domain-invariant features, derived from a multi-grained MMD (Maximum Mean Discrepancy) objective, which enable knowledge transfer across domains; and (3) an advanced KD schema, which progressively adapts a powerful pre-trained language model to the target domain. Extensive experiments on four benchmarks show that PDALN effectively adapts high-resource source domains to low-resource target domains, even when they differ substantially in terminology and writing style. Comparisons with strong baselines confirm that PDALN achieves state-of-the-art performance.
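As a pointer for the domain-alignment component: MMD is the standard two-sample distance between source- and target-domain feature distributions, estimated empirically under a kernel \(k\) (e.g., a Gaussian RBF). The display below is the textbook biased estimate; applying it at several granularities of the encoder's representations is our reading of "multi-grained" rather than a detail spelled out in this abstract.

\[
\widehat{\mathrm{MMD}}^2(X, Y) \;=\; \frac{1}{m^2}\sum_{i=1}^{m}\sum_{i'=1}^{m} k(x_i, x_{i'}) \;-\; \frac{2}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n} k(x_i, y_j) \;+\; \frac{1}{n^2}\sum_{j=1}^{n}\sum_{j'=1}^{n} k(y_j, y_{j'})
\]

where \(X=\{x_i\}_{i=1}^{m}\) are source-domain features and \(Y=\{y_j\}_{j=1}^{n}\) are target-domain features; minimizing this quantity alongside the distillation objective encourages domain-invariant representations.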
