Categorizing Gigabytes: Experiments on the RCV1 Corpus