Semtinel: interactive supervision of automatic indexing

Thesaurus-based indexing is a common approach for improving the results of document retrieval. With the growing number of documents available, manual indexing is no longer feasible, and statistical methods for automated document indexing are becoming an attractive alternative. However, especially in areas where automatic systems could complement or replace manual indexing, the correctness and completeness of the resulting annotations are very important. We argue that the quality of automatic indexing depends not only on the indexing system involved, but also on the quality of the thesaurus, that is, on its ability to adequately cover the contents to be indexed. Manually verifying all automatically assigned annotations is obviously not a solution, and it is questionable whether verifying random samples would be sufficient to ensure an overall annotation quality comparable to that of manual annotation. We propose integrating a revision step, performed by a human expert and supported by Semtinel, into the process of automatic document indexing, as shown in the following diagram.