Learning Structured Representations of Entity Names Using Active Learning and Weak Supervision

Structured representations of entity names are useful for many entity-related tasks such as entity normalization and variant generation. Learning the implicit structured representations of entity names without context or external knowledge is particularly challenging. In this paper, we present a novel learning framework that combines active learning and weak supervision to solve this problem. Our experimental evaluation shows that this framework enables the learning of high-quality models from only about a dozen labeled examples.
