Robust Bloom Filters for Large MultiLabel Classification Tasks

This paper presents an approach to multilabel classification (MLC) with a large number of labels. Our approach is a reduction to binary classification in which label sets are represented by low-dimensional binary vectors. This representation follows the principle of Bloom filters, a space-efficient data structure originally designed for approximate membership testing. We show that a naive application of Bloom filters in MLC is not robust to errors made by the individual binary classifiers. We then present an approach that exploits a specific feature of real-world datasets when the number of labels is large: many labels (almost) never appear together. Our approach is provably robust, has sublinear training and inference complexity with respect to the number of labels, and compares favorably to state-of-the-art algorithms on two large-scale multilabel datasets.
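To make the reduction concrete, the following is a minimal sketch of the standard (naive) Bloom filter encoding of a label set, not the robust variant the paper proposes. The function names, the salted SHA-256 hashing scheme, and the parameters `num_bits` and `num_hashes` are assumptions chosen for illustration. In the MLC setting, each bit of the encoded vector would be the target of one binary classifier; here the "predicted" bits are simply the encoded bits, optionally with one bit flipped to mimic a classifier error.

```python
import hashlib


def bit_positions(label, num_bits, num_hashes):
    """Map a label to at most num_hashes bit positions (hypothetical hashing scheme)."""
    positions = set()
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{label}".encode("utf-8")).digest()
        positions.add(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def encode(label_set, num_bits, num_hashes):
    """Represent a label set as a binary vector: the OR of each label's bits."""
    bits = [0] * num_bits
    for label in label_set:
        for pos in bit_positions(label, num_bits, num_hashes):
            bits[pos] = 1
    return bits


def decode(bits, all_labels, num_hashes):
    """Return every label whose bits are all set (may include false positives).

    This loop is linear in the number of labels, for simplicity only.
    """
    num_bits = len(bits)
    return {
        label
        for label in all_labels
        if all(bits[p] for p in bit_positions(label, num_bits, num_hashes))
    }


if __name__ == "__main__":
    all_labels = [f"label_{i}" for i in range(1000)]
    true_labels = {"label_3", "label_42"}

    bits = encode(true_labels, num_bits=64, num_hashes=2)
    decoded = decode(bits, all_labels, num_hashes=2)
    print("true labels recovered:", true_labels <= decoded)

    # Simulate one binary classifier error: a bit of "label_3" predicted as 0.
    corrupted = list(bits)
    corrupted[next(iter(bit_positions("label_3", 64, 2)))] = 0
    print("label_3 survives one bit error:",
          "label_3" in decode(corrupted, all_labels, num_hashes=2))
```

With exact bits, decoding never misses a true label but may return extra ones. A single bit wrongly predicted as 0 removes every label that hashes to that position, which illustrates the sensitivity to individual classifier errors that the abstract describes.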
