Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition

Named entity recognition (NER) is a fundamental task in many natural language processing (NLP) applications, such as text summarization and semantic information retrieval. Recently, deep neural networks (NNs) with attention mechanisms have yielded excellent performance in NER by exploiting character-level and word-level representation learning. In this paper, we propose a deep context-aware bidirectional long short-term memory (CaBiLSTM) model for the Sindhi NER task. The model relies on contextual representation learning (CRL), a bidirectional encoder, self-attention, and a sequential conditional random field (CRF). The CaBiLSTM model incorporates task-oriented CRL based on joint character-level and word-level representations. It first takes character-level input to learn character representations. These character representations are then transformed into word features, from which the bidirectional encoder learns the word representations. The output of the final encoder is passed through a hidden layer into the self-attention layer before decoding. Finally, we employ the CRF to predict the label sequences. We compare the baselines and the proposed CaBiLSTM model using pretrained Sindhi GloVe (SdGloVe), Sindhi fastText (SdfastText), task-oriented, and CRL-based word representations on the recently proposed SiNER dataset. With CRL, the proposed CaBiLSTM model achieves an F1-score of 91.25% on the SiNER dataset without relying on additional hand-crafted features such as rules, gazetteers, or dictionaries.
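To illustrate the final decoding step, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF, written in plain Python. It is not the authors' implementation: the function name `viterbi_decode`, the three-label tag set, and all emission and transition scores are hypothetical values chosen for illustration. In a model like the one described, the emission scores would come from the encoder and self-attention layers, and the transition scores would be learned by the CRF.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions[t][y] is the per-token score of label y at position t.
    transitions[(p, y)] is the score of moving from label p to label y.
    """
    # Best score of any path that ends in each label at the first token.
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []

    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in labels:
            # Pick the previous label that maximizes path score into `cur`.
            prev = max(labels, key=lambda p: best[p] + transitions[(p, cur)])
            new_best[cur] = best[prev] + transitions[(prev, cur)] + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores (assumptions for illustration only).
labels = ["O", "B-PER", "I-PER"]
transitions = {(p, y): 0.0 for p in labels for y in labels}
transitions[("O", "I-PER")] = -10.0  # discourage I-PER directly after O

emissions = [
    {"O": 1.0, "B-PER": 0.9, "I-PER": 0.0},
    {"O": 0.5, "B-PER": 0.0, "I-PER": 2.0},
    {"O": 2.0, "B-PER": 0.0, "I-PER": 0.0},
]

# Per-token argmax ignores transitions and yields an invalid sequence.
greedy = [max(labels, key=lambda y: s[y]) for s in emissions]
decoded = viterbi_decode(emissions, transitions, labels)
print(greedy)
print(decoded)
```

With these toy scores, choosing the best label per token independently gives `O, I-PER, O`, which is not a well-formed entity span, whereas the sequence-level decoder returns `B-PER, I-PER, O`. This is the usual motivation for placing a CRF on top of a neural encoder in sequence labeling.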
