Personal Name Extraction from Japanese Historical Documents Using Machine Learning

In this poster, we propose a method for extracting the real names and aliases of persons from Japanese historical documents. The method applies a machine-learning-based named entity extraction technique that uses individual characters as the unit of analysis. A distinctive feature of the method is that it uses the annotations already attached to known named entities to find entities that have not yet been annotated. In experiments on "Yakusha-Hyoban-Ki", a collection of reviews of Kabuki actors from the Edo period (1603-1868) in Japan, the proposed method extracted personal names and aliases with an F-measure of approximately 0.91.
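
As a rough illustration of character-level named entity extraction, the sketch below shows one common way to turn existing span annotations into per-character training labels and back again. It is not the authors' implementation: the BIO tagging scheme, the label names NAME and ALIAS, the helper names spans_to_bio, bio_to_spans and f_measure, the exact-match span-level scoring, and the toy sentence are all assumptions made for this example. The learner that would be trained on these labels is left out because the abstract does not specify it.

```python
from typing import List, Tuple

# (start, end_exclusive, label); the labels "NAME" and "ALIAS" are assumed here.
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Convert annotated spans into one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover spans from a per-character BIO tag sequence."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        # Close the open span unless this tag continues it.
        if start is not None and tag != f"I-{label}":
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def f_measure(gold: List[Span], predicted: List[Span]) -> float:
    """Span-level F-measure, counting only exact matches as correct (an assumption)."""
    true_positives = len(set(gold) & set(predicted))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy sentence invented for illustration; it is not taken from the corpus.
text = "市川團十郎は成田屋"
gold_spans = [(0, 5, "NAME"), (6, 9, "ALIAS")]
tags = spans_to_bio(text, gold_spans)
```

A sequence labeler trained on such tags would predict one tag per character of unannotated text, and bio_to_spans would turn those predictions into candidate names and aliases. Working per character avoids the need for a word segmenter, which may be unreliable on historical Japanese.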