A survey of machine learning for big data processing

Big data are now rapidly expanding in all science and engineering domains. While the potential of these massive data is undoubtedly significant, fully making sense of them requires new ways of thinking and novel learning techniques to address the various challenges. In this paper, we present a literature survey of the latest advances in research on machine learning for big data processing. First, we review machine learning techniques and highlight some promising learning methods from recent studies, such as representation learning, deep learning, distributed and parallel learning, transfer learning, active learning, and kernel-based learning. Next, we analyze and discuss the challenges of machine learning for big data and possible solutions. Following that, we investigate the close connections between machine learning and signal processing techniques for big data processing. Finally, we outline several open issues and research trends.
