Machine Learning Algorithms for Big Data

The growth of data from varied sources has created an enormous pool of resources. However, using those resources for any practical task requires a deep understanding of the characteristics of the data. The goal of machine learning algorithms is to learn these characteristics and use them for future predictions. In the context of big data, however, applying machine learning algorithms relies on effective data processing techniques, such as data parallelism, where the work is divided over large chunks of the data. Hence, machine learning methodologies are increasingly becoming statistical and less rule-based in order to handle data at this scale.
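
As a rough illustration of the data parallelism mentioned above, the following minimal sketch splits a dataset into chunks, computes a partial gradient on each chunk, and combines the partial results into the full-batch gradient. The model (a one-variable least-squares fit), the function names, the chunk size, and the use of a thread pool are all assumptions chosen for illustration and are not taken from the source. In a real system each chunk would typically be processed on a separate machine or process.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, chunk_size):
    """Partition the dataset into consecutive chunks (the last may be shorter)."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def chunk_gradient(chunk, w, b):
    """Summed squared-error gradient of y ~ w*x + b over a single chunk.

    Returns (grad_w, grad_b, number_of_rows) so partial results can be combined.
    """
    grad_w = 0.0
    grad_b = 0.0
    for x, y in chunk:
        err = (w * x + b) - y
        grad_w += 2.0 * err * x
        grad_b += 2.0 * err
    return grad_w, grad_b, len(chunk)


def data_parallel_gradient(rows, w, b, chunk_size=4, workers=2):
    """Mean gradient over all rows, computed chunk by chunk.

    Each worker handles one chunk at a time; the partial sums are then
    added and divided by the total row count. Threads are used here only
    to keep the sketch self-contained.
    """
    chunks = split_into_chunks(rows, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: chunk_gradient(c, w, b), chunks))
    total_rows = sum(p[2] for p in partials)
    grad_w = sum(p[0] for p in partials) / total_rows
    grad_b = sum(p[1] for p in partials) / total_rows
    return grad_w, grad_b


if __name__ == "__main__":
    # Hypothetical toy data generated from y = 3x + 1.
    rows = [(float(x), 3.0 * x + 1.0) for x in range(10)]
    print(data_parallel_gradient(rows, w=0.5, b=0.0))
```

Because the gradient of a sum of per-example losses is the sum of the per-example gradients, the combined result matches what a single pass over the whole dataset would produce, which is what makes this kind of chunked processing attractive at scale.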
