“Il Piave mormorava…”: Recognizing Locations and other Named Entities in Italian Texts on the Great War

SUMMARY. A growing number of sources about World War I (WWI) are now available in digital form. In this paper, we illustrate the automatic creation of a domain corpus annotated with named entities (NEs), which we use to adapt an existing named entity recognizer (NER) to Italian WWI texts. We discuss the annotation of the training and test corpus and report the results of the system evaluation.

RIASSUNTO (translated from the Italian). In recent years, an ever-growing body of material on World War I has become available in digital form. In this work we illustrate the automatic creation of a training corpus used to adapt an existing NER to Italian texts on World War I, and we present the evaluation results of our system trained on the new corpus.
