Scalable algorithms for large-scale machine learning problems : Application to multiclass classification and asynchronous distributed optimization. (Algorithmes d'apprentissage pour les grandes masses de données : Application à la classification multi-classes et à l'optimisation distribuée asynchron
[1] Bikash Joshi,et al. Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification , 2017, NIPS.
[2] Johannes Fürnkranz,et al. Efficient prediction algorithms for binary decomposition techniques , 2011, Data Mining and Knowledge Discovery.
[3] Chih-Jen Lin,et al. LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..
[4] Michalis K. Titsias,et al. One-vs-Each Approximation to Softmax for Scalable Estimation of Probabilities , 2016, NIPS.
[5] Alfonso Niño,et al. A Survey of Parallel Programming Models and Tools in the Multi and Many-core Era , 2012 .
[6] Stephen P. Boyd,et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..
[7] Jason Weston,et al. Multi-Class Support Vector Machines , 1998 .
[8] John Langford,et al. Slow Learners are Fast , 2009, NIPS.
[9] Hsuan-Tien Lin,et al. Multi-label Classification with Error-correcting Codes , 2011, ACML.
[10] Bikash Joshi,et al. On Binary Reduction of Large-Scale Multiclass Classification Problems , 2015, IDA.
[11] Bernhard Schölkopf,et al. Extracting Support Data for a Given Task , 1995, KDD.
[12] Massimiliano Pontil,et al. Empirical Bernstein Bounds and Sample-Variance Penalization , 2009, COLT.
[13] Heng Huang,et al. Asynchronous Stochastic Gradient Descent with Variance Reduction for Non-Convex Optimization , 2016, AAAI.
[14] James T. Kwok,et al. Fast Distributed Asynchronous SGD with Variance Reduction , 2015, ArXiv.
[15] Moustapha Cissé,et al. Robust Bloom Filters for Large MultiLabel Classification Tasks , 2013, NIPS.
[16] Mark W. Schmidt,et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets , 2012, NIPS.
[17] Gediminas Adomavicius,et al. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions , 2005, IEEE Transactions on Knowledge and Data Engineering.
[18] Guy Lapalme,et al. A systematic analysis of performance measures for classification tasks , 2009, Inf. Process. Manag..
[19] Wu-Jun Li,et al. Distributed Stochastic ADMM for Matrix Factorization , 2014, CIKM.
[20] E. Lehmann,et al. Nonparametrics: Statistical Methods Based on Ranks , 1976 .
[21] Alexander J. Smola,et al. On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants , 2015, NIPS.
[22] Chih-Jen Lin,et al. A Learning-Rate Schedule for Stochastic Gradient Methods to Matrix Factorization , 2015, PAKDD.
[23] Prateek Jain,et al. Sparse Local Embeddings for Extreme Multi-label Classification , 2015, NIPS.
[24] Michael J. Pazzani,et al. Content-Based Recommendation Systems , 2007, The Adaptive Web.
[25] Shai Shalev-Shwartz,et al. Stochastic dual coordinate ascent methods for regularized loss , 2012, J. Mach. Learn. Res..
[26] Heinz H. Bauschke,et al. Convex Analysis and Monotone Operator Theory in Hilbert Spaces , 2011, CMS Books in Mathematics.
[27] Jason Weston,et al. WSABIE: Scaling Up to Large Vocabulary Image Annotation , 2011, IJCAI.
[28] Andreas Christmann,et al. Fast Learning from Non-i.i.d. Observations , 2009, NIPS.
[29] John Riedl,et al. An algorithmic framework for performing collaborative filtering , 1999, SIGIR '99.
[30] Seunghak Lee,et al. More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server , 2013, NIPS.
[31] Nicolás García-Pedrajas,et al. Improving multiclass pattern recognition by the combination of two strategies , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[32] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[33] Georgios Paliouras,et al. LSHTC: A Benchmark for Large-Scale Text Classification , 2015, ArXiv.
[34] Inderjit S. Dhillon,et al. Large-scale Multi-label Learning with Missing Labels , 2013, ICML.
[35] Koby Crammer,et al. On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines , 2002, J. Mach. Learn. Res..
[36] Massih-Reza Amini,et al. Entropy-Based Concentration Inequalities for Dependent Variables , 2015, ICML.
[37] John Langford,et al. Conditional Probability Tree Estimation Analysis and Algorithms , 2009, UAI.
[38] Hsuan-Tien Lin,et al. Feature-aware Label Space Dimension Reduction for Multi-label Classification , 2012, NIPS.
[39] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[40] Chih-Jen Lin,et al. A comparison of methods for multiclass support vector machines , 2002, IEEE Trans. Neural Networks.
[41] Jason Weston,et al. Label Embedding Trees for Large Multi-Class Tasks , 2010, NIPS.
[42] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[43] Gideon S. Mann,et al. Distributed Training Strategies for the Structured Perceptron , 2010, NAACL.
[44] James T. Kwok,et al. Asynchronous Distributed ADMM for Consensus Optimization , 2014, ICML.
[45] Bikash Joshi,et al. Multi-class to Binary reduction of Large-scale classification Problems , 2015 .
[46] Chih-Jen Lin,et al. A fast parallel SGD for matrix factorization in shared memory systems , 2013, RecSys.
[47] James T. Kwok,et al. Efficient Multi-label Classification with Many Labels , 2013, ICML.
[48] Manik Varma,et al. FastXML: a fast, accurate and stable tree-classifier for extreme multi-label learning , 2014, KDD.
[49] Massih-Reza Amini,et al. An Asynchronous Distributed Framework for Large-scale Learning Based on Parameter Exchanges , 2017, 1705.07751.
[50] Chia-Hua Ho,et al. Recent Advances of Large-Scale Linear Classification , 2012, Proceedings of the IEEE.
[51] Anderson Rocha,et al. Multiclass From Binary: Expanding One-Versus-All, One-Versus-One and ECOC-Based Approaches , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[52] Sol Ji Kang,et al. Performance Comparison of OpenMP, MPI, and MapReduce in Practical Problems , 2015, Adv. Multim..
[53] Sophie Ahrens,et al. Recommender Systems , 2012 .
[54] Ohad Shamir,et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method , 2013, ICML.
[55] Andrew Zisserman,et al. Deep Face Recognition , 2015, BMVC.
[56] Tao Qin,et al. LETOR: A benchmark collection for research on learning to rank for information retrieval , 2010, Information Retrieval.
[57] Patrice Marcotte,et al. Co-Coercivity and Its Role in the Convergence of Iterative Schemes for Solving Variational Inequalities , 1996, SIAM J. Optim..
[58] Deva Ramanan,et al. Efficiently Scaling up Crowdsourced Video Annotation , 2012, International Journal of Computer Vision.
[59] Fabian Pedregosa,et al. ASAGA: Asynchronous Parallel SAGA , 2016, AISTATS.
[60] Ryan M. Rifkin,et al. In Defense of One-Vs-All Classification , 2004, J. Mach. Learn. Res..
[61] Krishnakumar Balasubramanian,et al. The Landmark Selection Method for Multiple Output Prediction , 2012, ICML.
[62] Bikash Joshi,et al. Asynchronous Distributed Matrix Factorization with Similar User and Item Based Regularization , 2016, RecSys.
[63] Wu-Jun Li,et al. Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee , 2016, AAAI.
[64] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[65] Christopher D. Manning,et al. Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol..
[66] John Langford,et al. Logarithmic Time One-Against-Some , 2016, ICML.
[67] André Carlos Ponce de Leon Ferreira de Carvalho,et al. A review on the combination of binary classifiers in multiclass problems , 2008, Artificial Intelligence Review.
[68] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.
[69] Thomas G. Dietterich,et al. Solving Multiclass Learning Problems via Error-Correcting Output Codes , 1994, J. Artif. Intell. Res..
[70] Yehuda Koren,et al. Matrix Factorization Techniques for Recommender Systems , 2009, Computer.
[71] Ioannis Partalas,et al. On power law distributions in large-scale taxonomies , 2014, SKDD.
[72] John Langford,et al. Logarithmic Time Online Multiclass prediction , 2015, NIPS.
[73] Vijay V. Raghavan,et al. A critical analysis of vector space model for information retrieval , 1986, J. Am. Soc. Inf. Sci..
[74] Alexander Dekhtyar,et al. Information Retrieval , 2018, Lecture Notes in Computer Science.
[75] Rong Hu,et al. Active Learning for Text Classification , 2011 .
[76] Peter J. Haas,et al. Large-scale matrix factorization with distributed stochastic gradient descent , 2011, KDD.
[77] Nikos Karampatziakis,et al. Log-time and Log-space Extreme Classification , 2016, ArXiv.
[78] Julien Mairal,et al. Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning , 2014, SIAM J. Optim..
[79] Jason Weston,et al. Label Partitioning For Sublinear Ranking , 2013, ICML.
[80] Yoram Singer,et al. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..
[81] Tao Qin,et al. LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval , 2007 .
[82] Manik Varma,et al. Extreme Multi-label Loss Functions for Recommendation, Tagging, Ranking & Other Missing Label Applications , 2016, KDD.
[83] Tie-Yan Liu,et al. Learning to rank for information retrieval , 2009, SIGIR.
[84] Tie-Yan Liu,et al. Learning to Rank for Information Retrieval , 2011 .
[85] Apostol Natsev,et al. YouTube-8M: A Large-Scale Video Classification Benchmark , 2016, ArXiv.
[86] Neha Mehra,et al. Survey on Multiclass Classification Methods , 2013 .
[87] Liva Ralaivola,et al. Chromatic PAC-Bayes Bounds for Non-IID Data , 2009, AISTATS.
[88] Massih-Reza Amini,et al. Generalization error bounds for classifiers trained with interdependent data , 2005, NIPS.
[89] Cordelia Schmid,et al. Image categorization using Fisher kernels of non-iid image models , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[90] Alexander J. Smola,et al. Parallelized Stochastic Gradient Descent , 2010, NIPS.
[91] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[92] A. Choromańska. Extreme Multi Class Classification , 2013 .
[93] Jeff G. Schneider,et al. Multi-Label Output Codes using Canonical Correlation Analysis , 2011, AISTATS.
[94] Ashish Kapoor,et al. Multilabel Classification using Bayesian Compressed Sensing , 2012, NIPS.
[95] Suvrit Sra,et al. Scalable nonconvex inexact proximal splitting , 2012, NIPS.
[96] Yuxiao Hu,et al. MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition , 2016, ECCV.
[97] Peter J. Haas,et al. Shared-memory and shared-nothing stochastic gradient descent algorithms for matrix completion , 2013, Knowledge and Information Systems.
[98] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[99] Thomas Hofmann,et al. Communication-Efficient Distributed Dual Coordinate Ascent , 2014, NIPS.
[100] Svante Janson,et al. Large deviations for sums of partly dependent random variables , 2004, Random Struct. Algorithms.
[101] Xiangfeng Wang,et al. Asynchronous Distributed ADMM for Large-Scale Optimization—Part I: Algorithm and Convergence Analysis , 2015, IEEE Transactions on Signal Processing.
[102] Mehryar Mohri,et al. Rademacher Complexity Bounds for Non-I.I.D. Processes , 2008, NIPS.
[103] Michel Vacher,et al. Improving Supervised Classification of Activities of Daily Living Using Prior Knowledge , 2011, Int. J. E Health Medical Commun..
[104] Fei-Fei Li,et al. What Does Classifying More Than 10, 000 Image Categories Tell Us? , 2010, ECCV.
[105] Johannes Fürnkranz,et al. Efficient implementation of class-based decomposition schemes for Naïve Bayes , 2013, Machine Learning.
[106] John Langford,et al. Error-Correcting Tournaments , 2009, ALT.
[107] Wotao Yin,et al. A Globally Convergent Algorithm for Nonconvex Optimization Based on Block Coordinate Update , 2014, J. Sci. Comput..
[108] Hsuan-Tien Lin,et al. Multilabel Classification with Principal Label Space Transformation , 2012, Neural Computation.
[109] Florent Perronnin,et al. Large-scale image categorization with explicit data embedding , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[110] Pradeep Ravikumar,et al. PD-Sparse : A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification , 2016, ICML.
[111] J. S. Cramer. The Origins of Logistic Regression , 2002 .
[112] Mu Li. Proposal Scaling Distributed Machine Learning with System and Algorithm Co-design , 2016 .
[113] J. Bobadilla,et al. Recommender systems survey , 2013, Knowl. Based Syst..
[114] Gideon S. Mann,et al. MapReduce/Bigtable for Distributed Optimization , 2010 .
[115] Yoram Singer,et al. Improved Boosting Algorithms Using Confidence-rated Predictions , 1998, COLT' 98.