Distributing Frank-Wolfe via Map-Reduce
[1] Kenneth L. Clarkson, et al. Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm, 2008, SODA '08.
[2] Zaïd Harchaoui, et al. Lifted coordinate descent for learning with trace-norm regularization, 2012, AISTATS.
[3] Yi Zhou, et al. Conditional Gradient Sliding for Convex Optimization, 2016, SIAM J. Optim.
[4] F. Leighton, et al. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes, 1991.
[5] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[6] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.
[7] Thomas Hofmann, et al. Map-Reduce for Machine Learning on Multicore, 2007.
[8] Arindam Banerjee, et al. Structured Estimation with Atomic Norms: General Bounds and Applications, 2015, NIPS.
[9] J. Sherman, et al. Adjustment of an Inverse Matrix Corresponding to a Change in One Element of a Given Matrix, 1950.
[10] Dan Garber, et al. A Linearly Convergent Variant of the Conditional Gradient Algorithm under Strong Convexity, with Applications to Online and Stochastic Optimization, 2016, SIAM J. Optim.
[11] Scott Shenker, et al. Spark: Cluster Computing with Working Sets, 2010, HotCloud.
[12] Alexander J. Smola, et al. Stochastic Frank-Wolfe methods for nonconvex optimization, 2016, 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[13] A. Ng. Feature selection, L1 vs. L2 regularization, and rotational invariance, 2004, Twenty-first international conference on Machine learning - ICML '04.
[14] A. Asuncion, et al. UCI Machine Learning Repository, University of California, Irvine, School of Information and Computer Sciences, 2007.
[15] Dimitri P. Bertsekas, et al. Nonlinear Programming, 1997.
[16] Pradeep Ravikumar, et al. Greedy Algorithms for Structurally Constrained High Dimensional Problems, 2011, NIPS.
[17] Stephen P. Boyd, et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, 2011, Found. Trends Mach. Learn.
[18] Sergei Vassilvitskii, et al. Fast greedy algorithms in mapreduce and streaming, 2013, SPAA.
[19] Philip Wolfe, et al. An algorithm for quadratic programming, 1956.
[20] Parikshit Shah, et al. Linear system identification via atomic norm regularization, 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[21] Elad Hazan, et al. Projection-free Online Learning, 2012, ICML.
[22] Shimrit Shtern, et al. Linearly convergent away-step conditional gradient for non-strongly convex functions, 2015, Mathematical Programming.
[23] F. T. Leighton, et al. Introduction to parallel algorithms and architectures, 1991.
[24] F. Maxwell Harper, et al. The MovieLens Datasets: History and Context, 2016, TIIS.
[25] Peng Li, et al. Distance Metric Learning with Eigenvalue Optimization, 2012, J. Mach. Learn. Res.
[26] Fei-Fei Li, et al. Efficient Image and Video Co-localization with Frank-Wolfe Algorithm, 2014, ECCV.
[27] Martin Jaggi, et al. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization, 2013, ICML.
[28] Pablo A. Parrilo, et al. The Convex Geometry of Linear Inverse Problems, 2010, Foundations of Computational Mathematics.
[29] Maria-Florina Balcan, et al. A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning, 2014, SDM.
[30] Haipeng Luo, et al. Variance-Reduced and Projection-Free Stochastic Optimization, 2016, ICML.
[31] Sergei Vassilvitskii, et al. Scalable K-Means++, 2012, Proc. VLDB Endow.
[32] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[33] Jan Vondrák, et al. Maximizing a Monotone Submodular Function Subject to a Matroid Constraint, 2011, SIAM J. Comput.
[34] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[35] Tianbao Yang, et al. Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent, 2013, NIPS.
[36] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[37] Patrice Marcotte, et al. Some comments on Wolfe's ‘away step’, 1986, Math. Program.
[38] Sergei Vassilvitskii, et al. Counting triangles and the curse of the last reducer, 2011, WWW.
[39] Douglas Stott Parker, et al. Map-reduce-merge: simplified relational data processing on large clusters, 2007, SIGMOD '07.
[40] Matthijs Douze, et al. Large-scale image classification with trace-norm regularization, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[41] Andreas Krause, et al. Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains, 2016, AISTATS.
[42] Martin Jaggi, et al. On the Global Linear Convergence of Frank-Wolfe Optimization Variants, 2015, NIPS.
[43] Aaron Q. Li, et al. Parameter Server for Distributed Machine Learning, 2013.