A Prototypical Semantic Decoupling Method via Joint Contrastive Learning for Few-Shot Named Entity Recognition

Few-shot named entity recognition (NER) aims to identify named entities from only a few labeled instances. Most existing prototype-based sequence labeling models tend to memorize entity mentions, which are easily confused when prototypes lie close together. In this paper, we propose a Prototypical Semantic Decoupling method via joint Contrastive learning (PSDC) for few-shot NER. Specifically, we decouple class-specific prototypes and contextual semantic prototypes using two masking strategies, leading the model to focus on two different kinds of semantic information at inference. In addition, we introduce joint contrastive learning objectives to better integrate the two kinds of decoupled information and prevent semantic collapse. Experimental results on two few-shot NER benchmarks show that PSDC consistently outperforms previous SOTA methods in overall performance. Extensive analysis further validates the effectiveness and generalization of PSDC.
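
For readers unfamiliar with the prototype-based sequence labeling that the abstract builds on, the following is a minimal sketch of the generic baseline idea only: each class prototype is the mean of its support-token embeddings, and a query token takes the label of the nearest prototype. It does not implement PSDC's masking strategies or its joint contrastive objectives, which the abstract does not specify. The function names, the toy 2-D embeddings, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_prototypes(support):
    """Average the token embeddings of each class in the support set."""
    sums, counts = {}, defaultdict(int)
    for embedding, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(embedding)
        sums[label] = [s + x for s, x in zip(sums[label], embedding)]
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def nearest_prototype(embedding, prototypes):
    """Label a query token with the class whose prototype is closest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(embedding, prototypes[label]))


# Toy 2-D "embeddings" (hypothetical); a real system would use encoder outputs.
support = [
    ([1.0, 0.0], "PER"), ([0.8, 0.2], "PER"),
    ([0.0, 1.0], "LOC"), ([0.2, 0.8], "LOC"),
    ([0.0, 0.0], "O"),
]
prototypes = build_prototypes(support)
predicted = nearest_prototype([0.9, 0.2], prototypes)
```

In this toy setup, the confusion the abstract describes would arise when two class prototypes end up close to each other, so that a small shift in a query embedding flips its label.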