Paper Information - CONTaiNER: Few-Shot Named Entity Recognition via Contrastive Learning

Named Entity Recognition (NER) in the few-shot setting is imperative for entity tagging in low-resource domains. Existing approaches learn only class-specific semantic features and intermediate representations from source domains. This limits generalization to unseen target domains and results in suboptimal performance. To this end, we present CONTaiNER, a novel contrastive learning technique that optimizes the inter-token distribution distance for few-shot NER. Instead of optimizing class-specific attributes, CONTaiNER optimizes a generalized objective of differentiating between token categories based on their Gaussian-distributed embeddings. This effectively alleviates overfitting issues originating from the training domains. Our experiments on several traditional test domains (OntoNotes, CoNLL'03, WNUT'17, GUM) and a new large-scale few-shot NER dataset (Few-NERD) demonstrate that, on average, CONTaiNER outperforms previous methods by 3%-13% absolute F1 points while showing consistent performance trends, even in challenging scenarios where previous approaches could not achieve appreciable performance.
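The idea of contrasting tokens through the distance between their Gaussian embeddings can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions, not the paper's implementation: the abstract does not name the distance or the exact loss, so the symmetric KL divergence between diagonal Gaussians, the `exp(-distance)` similarity, and the function names `kl_diag_gauss`, `sym_kl`, and `contrastive_loss` are all assumptions made here.

```python
import numpy as np

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(N_p || N_q) for Gaussians with diagonal covariance (closed form)."""
    return 0.5 * np.sum(
        var_p / var_q + (mu_q - mu_p) ** 2 / var_q - 1.0 + np.log(var_q / var_p),
        axis=-1,
    )

def sym_kl(mu_p, var_p, mu_q, var_q):
    """Symmetrized KL divergence, used here as the inter-token distance (assumption)."""
    return 0.5 * (
        kl_diag_gauss(mu_p, var_p, mu_q, var_q)
        + kl_diag_gauss(mu_q, var_q, mu_p, var_p)
    )

def contrastive_loss(mu, var, labels):
    """Illustrative contrastive objective over token distributions.

    mu, var: arrays of shape (n_tokens, dim); var must be strictly positive.
    labels:  integer array of shape (n_tokens,) with token categories.
    Tokens sharing a label are pulled together; all others are pushed apart.
    """
    n = len(labels)
    # Pairwise distances via broadcasting: result has shape (n, n).
    dist = sym_kl(mu[:, None, :], var[:, None, :], mu[None, :, :], var[None, :, :])
    sim = np.exp(-dist)
    np.fill_diagonal(sim, 0.0)  # a token is never its own positive or negative
    same = (labels[:, None] == labels[None, :]) & ~np.eye(n, dtype=bool)
    losses = []
    for i in range(n):
        if same[i].any():  # skip tokens with no same-category partner
            positive = sim[i][same[i]].sum()
            losses.append(-np.log(positive / sim[i].sum()))
    return float(np.mean(losses))
```

With this formulation the loss shrinks when tokens of the same category have nearby distributions and tokens of different categories have distant ones, which is the category-differentiating behaviour the abstract describes. In a real model, `mu` and `var` would be produced by projection heads on top of an encoder; here they are plain arrays.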

Rebecca J. Passonneau | Arzoo Katiyar | Rui Zhang | Sarkar Snigdha Sarathi Das
