The Named Entity Recognizer Framework

Named Entity Recognition (NER) has emerged as one of the Natural Language Processing (NLP) technologies. This paper presents a Named Entity Recognition system for English and Hindi. English text is mixed-case, so clues such as an initial capital letter often indicate the presence of named entities like person names and places. Hindi provides no such clues, which makes named entities difficult to identify. To overcome this problem, we store the Hindi text corresponding to English words in our database. For this purpose we built our own database, which contains place, name, and organization entities with their respective sub-categories as well as their Hindi transliterations. Since it is not possible to store all numbers and dates in the database, we handle them with predefined date formats and patterns together with a matcher function.
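
The approach described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the in-memory dictionary stands in for the database, and the entries, category labels, date formats, and the names `GAZETTEER`, `HINDI_INDEX`, `tag_token`, and `recognize` are all assumptions made for illustration. Tokens are first checked against date and number patterns, then looked up by their English form or by their Hindi transliteration.

```python
import re

# Hypothetical stand-in for the database: English form mapped to
# (category, sub-category, Hindi transliteration). Entries are illustrative.
GAZETTEER = {
    "delhi": ("PLACE", "city", "दिल्ली"),
    "india": ("PLACE", "country", "भारत"),
    "sachin": ("NAME", "first_name", "सचिन"),
    "infosys": ("ORGANIZATION", "company", "इन्फोसिस"),
}

# Reverse index so Hindi tokens can be looked up directly.
HINDI_INDEX = {hindi: (cat, sub) for (cat, sub, hindi) in GAZETTEER.values()}

# Assumed date formats; a real system would likely define many more.
DATE_PATTERNS = [
    re.compile(r"\d{1,2}[/.-]\d{1,2}[/.-]\d{4}"),  # 15/08/1947, 15-08-1947
    re.compile(r"\d{4}-\d{2}-\d{2}"),              # 1947-08-15
]
# For str patterns, \d also matches Devanagari digits.
NUMBER_PATTERN = re.compile(r"\d+(?:[.,]\d+)*")

# Punctuation stripped from token edges, including the Hindi danda.
PUNCTUATION = ".,;:!?\"'()।"


def tag_token(token):
    """Return an entity label for one token, or None if it is not recognized."""
    if any(p.fullmatch(token) for p in DATE_PATTERNS):
        return "DATE"
    if NUMBER_PATTERN.fullmatch(token):
        return "NUMBER"
    entry = GAZETTEER.get(token.lower())  # English lookup, case-insensitive
    if entry:
        return entry[0]
    entry = HINDI_INDEX.get(token)        # Hindi lookup, no case clue exists
    if entry:
        return entry[0]
    return None


def recognize(text):
    """Split on whitespace and return (token, label) pairs for known entities."""
    entities = []
    for raw in text.split():
        token = raw.strip(PUNCTUATION)
        if not token:
            continue
        label = tag_token(token)
        if label:
            entities.append((token, label))
    return entities


print(recognize("Sachin visited Delhi on 15/08/1947."))
# [('Sachin', 'NAME'), ('Delhi', 'PLACE'), ('15/08/1947', 'DATE')]
```

The sketch splits on whitespace rather than on a `\w+` pattern because Devanagari vowel signs are combining marks, which Python's `\w` does not match, so a word-character tokenizer would break Hindi words apart.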