Paper information - Online ℓ1-Dictionary Learning with Application to Novel Document Detection

Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the scalable detection of novel documents in a voluminous stream of text documents. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on the alternating direction method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.
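
To make the contrast between the two loss functions concrete, the following is a minimal sketch and not the paper's algorithm. It compares the ℓ1 and squared reconstruction errors of a document vector under a fixed dictionary, and includes the elementwise soft-thresholding operator that ADMM-style ℓ1 solvers commonly use as a building block. The names `soft_threshold`, `l1_error`, `squared_error`, and the toy matrices `A`, `x`, `clean`, and `spiky` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1: shrink each entry toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def l1_error(p, A, x):
    """l1 reconstruction error ||p - A x||_1."""
    return float(np.abs(p - A @ x).sum())

def squared_error(p, A, x):
    """Squared reconstruction error ||p - A x||_2^2."""
    return float(np.sum((p - A @ x) ** 2))

# Hypothetical toy data: 4 terms (rows) and 2 dictionary atoms (columns).
A = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
x = np.array([1.0, 1.0])                          # fixed sparse code
clean = A @ x                                     # exactly representable document
spiky = clean + np.array([0.0, 0.0, 0.0, 5.0])    # one term deviates by 5

# The single large deviation is counted linearly by the l1 error
# and quadratically by the squared error.
print("clean:", l1_error(clean, A, x), squared_error(clean, A, x))
print("spiky:", l1_error(spiky, A, x), squared_error(spiky, A, x))

# Entries with magnitude below the threshold are set to zero.
print(soft_threshold(np.array([3.0, -0.5, 1.5]), 1.0))
```

In this toy setting, a document whose reconstruction error under the current dictionary is large would be a candidate for being flagged as novel; the threshold and the way the dictionary is updated are left unspecified here because the abstract does not state them.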
