On a generalized name entity recognizer based on Hidden Markov Models

This paper presents a Named Entity Recognition (NER) system based on Hidden Markov Models. The system design is language independent: the target language and scope of the NER are determined by the training corpus. The NER consists of two subsystems that detect and label entities independently. Each subsystem implements a different approach to that statistical theory, showing that each component can complement the results of the other. Unlike most previous work, the system returns two labels when the components produce different results. This redundancy is an advantage when human supervision is mandatory at the end of the process, as in intelligence environments.
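
The combination rule described above, in which both labels are kept whenever the two components disagree, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `combine_labels`, the label set (PER, LOC, ORG, O), and the assumption that both subsystems label the same token sequence one label per token are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple, Union

Label = str
# A combined result is a single label, or a pair when the taggers disagree.
Combined = Union[Label, Tuple[Label, Label]]


def combine_labels(first: List[Label], second: List[Label]) -> List[Combined]:
    """Merge per-token labels from two independent taggers (hypothetical helper).

    Where both taggers agree, the shared label is kept. Where they differ,
    both labels are returned so a human reviewer can make the final decision.
    """
    if len(first) != len(second):
        raise ValueError("both taggers must label the same token sequence")
    combined: List[Combined] = []
    for a, b in zip(first, second):
        combined.append(a if a == b else (a, b))
    return combined


# Illustrative labels only; these are not outputs of the system in the paper.
tokens = ["Maria", "visited", "Paris"]
labels_a = ["PER", "O", "LOC"]
labels_b = ["PER", "O", "ORG"]

for token, label in zip(tokens, combine_labels(labels_a, labels_b)):
    print(token, label)
```

Keeping the disagreement explicit, rather than resolving it by voting or by a confidence threshold, matches the setting in which a human supervisor reviews the output at the end of the process.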