Paper information - Scalable algorithms for large-scale machine learning problems : Application to multiclass classification and asynchronous distributed optimization. (Algorithmes d'apprentissage pour les grandes masses de données : Application à la classification multi-classes et à l'optimisation distribuée asynchron

This thesis develops scalable algorithms for large-scale machine learning and presents two perspectives on handling large data. The first part considers large-scale multiclass classification. We introduce the multiclass classification task and the challenges of classifying with a large number of classes. To alleviate these challenges, we propose an algorithm that reduces the original multiclass problem to an equivalent binary one. Based on this reduction technique, we introduce a scalable method for multiclass classification with a very large number of classes and perform detailed theoretical and empirical analyses. The second part discusses distributed machine learning. In this domain, we introduce an asynchronous framework for distributed optimization and apply it to two popular problems: matrix factorization for large-scale recommender systems and large-scale binary classification. For matrix factorization, we perform Stochastic Gradient Descent (SGD) in an asynchronous distributed manner. For large-scale binary classification, we use SVRG, a variant of SGD that applies a variance reduction technique, as the optimization algorithm.
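To make the variance-reduction idea concrete, the following is a minimal single-process sketch of SVRG applied to L2-regularized logistic regression for binary classification. It is an illustration under stated assumptions, not the asynchronous distributed variant developed in the thesis: the function names (`svrg`, `grad_i`, `full_grad`, `loss`), the regularization weight `lam`, the step size, and the inner-loop length are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grad_i(w, x, y, lam):
    # Gradient of the L2-regularized logistic loss on one example, y in {-1, +1}.
    margin = y * sum(wj * xj for wj, xj in zip(w, x))
    coeff = -y * sigmoid(-margin)
    return [coeff * xj + lam * wj for xj, wj in zip(x, w)]


def full_grad(w, X, Y, lam):
    # Average of the per-example gradients over the whole dataset.
    n = len(X)
    g = [0.0] * len(w)
    for x, y in zip(X, Y):
        gi = grad_i(w, x, y, lam)
        g = [a + b / n for a, b in zip(g, gi)]
    return g


def loss(w, X, Y, lam):
    # Mean logistic loss plus the L2 penalty, computed without overflow.
    total = 0.0
    for x, y in zip(X, Y):
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        if margin > 0:
            total += math.log1p(math.exp(-margin))
        else:
            total += -margin + math.log1p(math.exp(margin))
    return total / len(X) + 0.5 * lam * sum(wj * wj for wj in w)


def svrg(X, Y, lam=0.01, step=0.1, epochs=20, inner=None, seed=0):
    # Assumed hyperparameters; a real application would tune them.
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    inner = inner or 2 * n
    w_snap = [0.0] * d
    for _ in range(epochs):
        # Full gradient at the snapshot, computed once per outer epoch.
        mu = full_grad(w_snap, X, Y, lam)
        w = list(w_snap)
        for _ in range(inner):
            i = rng.randrange(n)
            g_w = grad_i(w, X[i], Y[i], lam)
            g_s = grad_i(w_snap, X[i], Y[i], lam)
            # Variance-reduced direction: stochastic gradient at w, corrected
            # by the same example's gradient at the snapshot plus mu.
            w = [wj - step * (a - b + m)
                 for wj, a, b, m in zip(w, g_w, g_s, mu)]
        w_snap = w  # the last inner iterate becomes the next snapshot
    return w_snap
```

The correction term keeps each update an unbiased estimate of the full gradient while its variance shrinks as the iterate and the snapshot approach the optimum, which is why a constant step size can be used here instead of the decaying step size that plain SGD typically requires.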

Bikash Joshi
