Paper Information - Multi-Label Prediction for Large Text Corpora

We study the problem of predicting labels for large text corpora in which each document can be assigned a variable number of labels. When the label dimensionality is small, the problem may seem trivial and can be solved with a series of one-vs-all classifiers. However, as the label dimensionality grows to several thousand, the parameter space becomes extremely large and the one-vs-all technique is no longer feasible. We propose a multi-label prediction model based on the factorization of higher-order moments of the word vectors, together with the cross moments between the labels and the words. Our model provides guaranteed convergence bounds on the extracted parameters. Further, it requires only three passes through the training dataset to obtain the parameters, resulting in a highly scalable algorithm that can train on gigabytes of data, consisting of millions of documents with hundreds of thousands of labels, using a single processor with 16 GB of RAM. Our model achieves a 10x-15x speed-up on large-scale datasets while delivering performance competitive with existing benchmark algorithms.
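To make the parameter-space argument concrete, the following minimal sketch compares rough parameter counts for a one-vs-all setup against a generic latent-factor model. The function names, the sizes, and the `K x (V + L)` parameterization are assumptions chosen for illustration; the abstract does not state the paper's exact parameterization, so this should not be read as the authors' model.

```python
def one_vs_all_params(vocab_size: int, num_labels: int) -> int:
    """One linear classifier per label: one weight per word plus a bias."""
    return num_labels * (vocab_size + 1)


def latent_factor_params(vocab_size: int, num_labels: int, num_factors: int) -> int:
    """Hypothetical factorized model: a K x V word matrix plus a K x L label matrix."""
    return num_factors * (vocab_size + num_labels)


# Illustrative sizes only; none of these figures come from the paper.
vocab_size, num_labels, num_factors = 100_000, 10_000, 100

dense = one_vs_all_params(vocab_size, num_labels)
factored = latent_factor_params(vocab_size, num_labels, num_factors)

print(f"one-vs-all:    {dense:,} parameters")
print(f"latent-factor: {factored:,} parameters")
print(f"ratio:         {dense / factored:.0f}x")
```

Under these assumed sizes, the one-vs-all count grows with the product of vocabulary size and label count, while the factorized count grows only with their sum, which is why the gap widens as the number of labels increases.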

Sayantan Dasgupta | Sayantani Dasgupta
