Classifying the Hungarian Web

In this paper we present some lessons learned from building vizsla, the keyword search and topic classification system used on the largest Hungarian portal, [origo.hu]. Based on a simple statistical language model and large-scale supporting evidence from vizsla, we argue that in topic classification only positive evidence matters.