With advances in technology, a large volume of digital data is generated every second from internal and external sources such as social networks, organizations, and business applications. Big data refers to digital data sets that are high in volume, velocity, and variety. Traditional tools and techniques fail to handle data sets of this size. Big data technologies have proved to be an effective means of collecting, processing, and analyzing data regardless of its size or format, whether structured, semi-structured, or unstructured. Large sets of information and data are produced by different organizations and social activities. Text mining, or text analytics, plays a significant role in deriving relevant information from text in a digital environment. Text mining includes techniques such as entity extraction, which automatically extracts structured information from unstructured or semi-structured documents. This paper details how entity extraction is useful in processing human-language texts using natural language processing. Entity extraction builds on methods such as part-of-speech tagging, which determines the nouns, verbs, adverbs, and adjectives in a sentence. The enhanced entity extraction method is mainly useful for filtering entities based on their parts of speech, thereby removing ambiguities. Entity extraction focuses on the relevant parts of a document and represents them in a structured manner.
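To make the idea of filtering entities by part of speech concrete, the following is a minimal sketch and not the method of the paper. It assumes a hypothetical toy lexicon (`TOY_LEXICON`) and a capitalization heuristic in place of a trained part-of-speech tagger; the function names `tag` and `extract_entities` are likewise illustrative only.

```python
import re

# Hypothetical toy lexicon; a real system would use a trained POS tagger.
TOY_LEXICON = {
    "the": "DET", "a": "DET", "in": "ADP", "of": "ADP",
    "opened": "VERB", "quickly": "ADV", "large": "ADJ", "office": "NOUN",
}


def tag(tokens):
    """Assign a coarse part-of-speech tag to each token."""
    tagged = []
    for tok in tokens:
        pos = TOY_LEXICON.get(tok.lower())
        if pos is None:
            # Assumption: unknown capitalized words are proper nouns,
            # all other unknown words are common nouns.
            pos = "PROPN" if tok[0].isupper() else "NOUN"
        tagged.append((tok, pos))
    return tagged


def extract_entities(text, keep=("PROPN",)):
    """Return maximal runs of adjacent tokens whose tag is in `keep`."""
    tokens = re.findall(r"[A-Za-z]+", text)
    entities, current = [], []
    for tok, pos in tag(tokens):
        if pos in keep:
            current.append(tok)
        elif current:
            # A token with a filtered-out tag ends the current entity.
            entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


sentence = "Acme Corp quickly opened a large office in Mumbai"
print(extract_entities(sentence))                   # proper nouns only
print(extract_entities(sentence, keep=("NOUN",)))   # common nouns only
```

Changing the `keep` argument changes which parts of speech survive the filter, which is the step that lets a pipeline discard verbs, adverbs, and adjectives before candidate entities are reported in structured form.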