Active Learning with History-Based Query Selection for Text Categorisation

Automated text categorisation systems learn a generalised hypothesis from large numbers of labelled examples. However, in many domains labelled data is scarce and expensive to obtain. Active learning is a technique that has been shown to reduce the amount of training data required to produce an accurate hypothesis. This paper proposes a novel method of incorporating predictions made in previous iterations of active learning into the selection of informative unlabelled examples. We show empirically that this method can yield higher classification accuracy than alternative techniques.