We describe an automatic projection algorithm for transferring frame-semantic information from English to Italian texts as a first step towards the creation of an Italian FrameNet. Given an English text with frame information and its Italian translation, we project the annotation in four steps: first, the Italian text is parsed; second, English-Italian alignment is carried out automatically at word level; third, we extract the semantic head of every annotated constituent on the English side of the corpus; finally, we project the annotation from English to Italian, using the aligned semantic heads as a bridge. Our work points out typical features of the Italian language with regard to frame-semantic annotation. In particular, we describe peculiarities of Italian that currently make the projection task more difficult than in the above-mentioned examples. In addition, we created a gold standard of 987 manually annotated sentences to evaluate the algorithm.
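The final projection step can be illustrated with a minimal sketch. It assumes that parsing, word alignment and semantic head extraction have already been carried out, and that their output is available as simple Python structures. The names `Constituent`, `Annotation` and `project`, the example sentence pair, and the rule of picking the widest Italian constituent headed by the aligned token are all assumptions made for illustration, not details taken from the algorithm described above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Constituent:
    start: int  # index of the first token (inclusive)
    end: int    # index of the last token (inclusive)
    head: int   # index of the semantic head token


@dataclass(frozen=True)
class Annotation:
    label: str               # frame element or target label
    constituent: Constituent


def project(
    annotations: List[Annotation],
    alignment: Dict[int, int],
    italian_constituents: List[Constituent],
) -> List[Annotation]:
    """Project English annotations onto Italian constituents via aligned heads."""
    projected = []
    for ann in annotations:
        # Follow the word alignment from the English head to an Italian token.
        it_head = alignment.get(ann.constituent.head)
        if it_head is None:
            continue  # unaligned head: nothing to project
        # Assumption: take the widest Italian constituent headed by that token.
        candidates = [c for c in italian_constituents if c.head == it_head]
        if not candidates:
            continue
        target = max(candidates, key=lambda c: c.end - c.start)
        projected.append(Annotation(ann.label, target))
    return projected


# Hypothetical sentence pair:
#   EN: The(0) boy(1) bought(2) a(3) book(4)
#   IT: Il(0) ragazzo(1) ha(2) comprato(3) un(4) libro(5)
english_annotations = [
    Annotation("Buyer", Constituent(0, 1, head=1)),
    Annotation("Target", Constituent(2, 2, head=2)),
    Annotation("Goods", Constituent(3, 4, head=4)),
]
alignment = {0: 0, 1: 1, 2: 3, 3: 4, 4: 5}  # English index -> Italian index
italian_constituents = [
    Constituent(0, 1, head=1),  # "Il ragazzo"
    Constituent(1, 1, head=1),  # "ragazzo"
    Constituent(2, 3, head=3),  # "ha comprato"
    Constituent(4, 5, head=5),  # "un libro"
    Constituent(5, 5, head=5),  # "libro"
]

result = project(english_annotations, alignment, italian_constituents)
for ann in result:
    print(ann.label, ann.constituent.start, ann.constituent.end)
```

In this sketch an annotation whose head has no aligned Italian token is simply skipped, which is one possible way of handling alignment gaps.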