Inforex - a collaborative system for text corpora annotation and analysis

We report the first major upgrade of Inforex, a web-based system for qualitative and collaborative annotation and analysis of text corpora. Inforex is part of the Polish CLARIN infrastructure. It is integrated with a digital repository for storing and publishing language resources, and it allows users to visualize, browse and annotate the text corpora stored in that repository. Following a series of workshops for researchers in the humanities and social sciences, we improved the graphical interface to make the system friendlier and more readable for inexperienced users. We also implemented new functionality for gold-standard annotation, which includes private annotations and annotation agreement performed by a super-annotator.
