Automatic Annotation Suggestions and Custom Annotation Layers in WebAnno

In this paper, we present a flexible approach to the efficient and exhaustive manual annotation of text documents. For this purpose, we extend WebAnno (Yimam et al., 2013), an open-source web-based annotation tool. While it was previously limited to specific annotation layers, our extension allows adding and configuring an arbitrary number of layers through a web-based UI. These layers can be annotated separately or simultaneously, and support most types of linguistic annotations such as spans, semantic classes, dependency relations, lexical chains, and morphology. Further, we tightly integrate a generic machine learning component for automatic annotation suggestions of span annotations. In two case studies, we show that automatic annotation suggestions, combined with our split-pane UI concept, significantly reduce annotation time.

[1] Koby Crammer, et al. Ultraconservative Online Algorithms for Multiclass Problems, 2001, J. Mach. Learn. Res.

[2] Thomas S. Morton, et al. WordFreak: An Open Tool for Linguistic Annotation, 2003, HLT-NAACL.

[3] Thilo Götz, et al. Design and implementation of the UIMA Common Analysis System, 2004, IBM Syst. J.

[4] Klemens Böhm, et al. Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor, 2007, ECDL.

[5] Nancy Ide, et al. GrAF: A Graph-based Format for Linguistic Annotations, 2007, LAW@ACL.

[6] Christian Chiarcos, et al. ANNIS: A Search Tool for Multi-Layer Annotated Corpora, 2009.

[7] Binyam Gebrekidan Gebre, et al. Part of speech tagging for Amharic, 2010.

[8] Kalina Bontcheva, et al. Text Processing with GATE, 2011.

[9] Sampo Pyysalo, et al. brat: a Web-based Tool for NLP-Assisted Text Annotation, 2012, EACL.

[10] Slav Petrov, et al. A Universal Part-of-Speech Tagset, 2011, LREC.

[11] Iryna Gurevych, et al. WebAnno: A Flexible, Web-based and Visually Supported System for Distributed Annotations, 2013, ACL.

[12] Kalina Bontcheva, et al. GATE Teamware: a web-based, collaborative text annotation framework, 2013, Lang. Resour. Evaluation.

[13] Lucia Specia, et al. Reducing Annotation Effort for Quality Estimation via Active Learning, 2013, ACL.

[14] Christian Biemann, et al. NoSta-D Named Entity Annotation for German: Guidelines and Dataset, 2014, LREC.

[15] Louise Deléger, et al. Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements, 2013, J. Am. Medical Informatics Assoc.