A narrow-domain entity recognition method based on domain relevance measurement and context information

Entity recognition is the basis of text mining. As knowledge-driven applications continue to develop, the types of target entities are becoming increasingly fine-grained. The lack of corpora and the limited number of entities are the main challenges for entity recognition in such settings. Based on this observation, this paper proposes a weakly supervised method for recognizing entities in a specific narrow domain by fusing domain relevance measurement with context information. The experimental results show that the proposed method achieves high efficiency and accuracy without manual participation.
