Named Entity Recognition

This chapter presents the application of ETL to language-independent named entity recognition (NER). The NER task consists of finding all proper nouns in a text and classifying them into a given set of categories of interest. We apply ETL and ETL Committee to three corpora in three different languages: Portuguese, Spanish and Dutch. The ETL system achieves results competitive with the state of the art on all three corpora. Moreover, ETL Committee significantly improves on the ETL results for all three corpora.

This chapter is organized as follows. In Sect. 7.1, we describe the NER task and the selected corpora. In Sect. 7.2, we detail the modeling configurations used in our NER system. In Sect. 7.3, we describe the configurations used in the machine learning algorithms. Section 7.4 presents the application of ETL to the HAREM Corpus. In Sect. 7.5, we present the application of ETL to the SPA CoNLL-2002 corpus. In Sect. 7.6, we detail the application of ETL to the DUT CoNLL-2002 corpus. Finally, Sect. 7.7 presents some concluding remarks.