Annotation Process Management Revisited

Proper management of the annotation process is crucial to the construction of corpora, which are indispensable to the data-driven techniques that have come to the forefront of NLP over the last two decades. This paper first identifies 10 needs that any general-purpose annotation system should address, including user and role management, delegation and monitoring of work, diffing of annotators’ work, versioning of corpora, and multilingual support. It then proposes a framework to address these needs. Finally, it introduces SLATE (Segment and Link-based Annotation Tool Enhanced), the second iteration of a web-based annotation tool, which is being rewritten to implement the proposed framework.