Online Machine Learning in Big Data Streams: Overview

[1]  Jie Jiang,et al.  Angel: a new large-scale machine learning system , 2018.

[2]  Mohamed Medhat Gaber,et al.  A Survey of Classification Methods in Data Streams , 2007, Data Streams - Models and Algorithms.

[3]  João Gama,et al.  On evaluating stream learning algorithms , 2012, Machine Learning.

[4]  Alireza Rezaei Mahdiraji.  Clustering data stream: A survey of algorithms , 2009, Int. J. Knowl. Based Intell. Eng. Syst..

[5]  Yaoliang Yu,et al.  Petuum: A New Platform for Distributed Machine Learning on Big Data , 2015, IEEE Trans. Big Data.

[6]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[7]  Michael Stonebraker,et al.  Aurora: a new model and architecture for data stream management , 2003, The VLDB Journal.

[8]  João Gama,et al.  Data stream mining in ubiquitous environments , 2014.

[9]  Rajiv Ranjan,et al.  Streaming Big Data Processing in Datacenter Clouds , 2014, IEEE Cloud Computing.

[10]  Craig Chambers,et al.  The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing , 2015, Proc. VLDB Endow..

[11]  Jennifer Widom,et al.  Models and issues in data stream systems , 2002, PODS.

[12]  Oscar Fontenla-Romero,et al.  Online Machine Learning , 2024, Machine Learning: Foundations, Methodologies, and Applications.

[13]  Geoff Hulten,et al.  Mining time-changing data streams , 2001, KDD '01.

[14]  Indranil Gupta,et al.  Stateful Scalable Stream Processing at LinkedIn , 2017, Proc. VLDB Endow..

[15]  Geoff Hulten,et al.  Mining high-speed data streams , 2000, KDD '00.

[16]  Gianmarco De Francisci Morales.  SAMOA: a platform for mining big data streams , 2013, WWW '13 Companion.

[17]  András A. Benczúr,et al.  Tutorial on Open Source Online Learning Recommenders , 2017, RecSys.

[18]  André Carlos Ponce de Leon Ferreira de Carvalho,et al.  Data stream clustering: A survey , 2013, CSUR.

[19]  Joseph M. Hellerstein,et al.  Distributed GraphLab: A Framework for Machine Learning in the Cloud , 2012, Proc. VLDB Endow..

[20]  Jennifer Widom,et al.  STREAM: the stanford stream data manager (demonstration description) , 2003, SIGMOD '03.

[21]  Katarzyna Musial,et al.  Next challenges for adaptive learning systems , 2012, SIGKDD Explor..

[22]  Jiawei Jiang,et al.  Heterogeneity-aware Distributed Parameter Servers , 2017, SIGMOD Conference.

[23]  Wei Fan,et al.  Extremely Fast Decision Tree Mining for Evolving Data Streams , 2017, KDD.

[24]  Jennifer Widom,et al.  Continuous queries over data streams , 2001, SIGMOD Rec..

[25]  Robert L. Grossman,et al.  Data mining standards initiatives , 2002, CACM.

[26]  S. Muthukrishnan,et al.  Data streams: algorithms and applications , 2005, SODA '03.

[27]  Leonardo Neumeyer,et al.  S4: Distributed Stream Computing Platform , 2010, 2010 IEEE International Conference on Data Mining Workshops.

[28]  Tianqi Chen,et al.  XGBoost: A Scalable Tree Boosting System , 2016, KDD.

[29]  Lu Liu,et al.  Muppet: MapReduce-Style Processing of Fast Data , 2012, Proc. VLDB Endow..

[30]  Alexander J. Smola,et al.  Scaling Distributed Machine Learning with the Parameter Server , 2014, OSDI.

[31]  Craig Chambers,et al.  FlumeJava: easy, efficient data-parallel pipelines , 2010, PLDI '10.

[32]  Daniel Mills,et al.  MillWheel: Fault-Tolerant Stream Processing at Internet Scale , 2013, Proc. VLDB Endow..

[33]  Latifur Khan,et al.  IoT Big Data Stream Mining , 2016, KDD.

[34]  Jun Zhou,et al.  PSMART: Parameter Server based Multiple Additive Regression Trees System , 2017, WWW.

[35]  Mohamed Medhat Gaber,et al.  Data stream mining in ubiquitous environments: state‐of‐the‐art and current directions , 2014, WIREs Data Mining Knowl. Discov..

[36]  Wei Fan,et al.  Mining big data: current status, and forecast to the future , 2013, SIGKDD Explor..

[37]  Seif Haridi,et al.  State Management in Apache Flink®: Consistent Stateful Distributed Stream Processing , 2017, Proc. VLDB Endow..

[38]  Wilfred Ng,et al.  A survey on algorithms for mining frequent itemsets over data streams , 2008, Knowledge and Information Systems.

[39]  Philip S. Yu,et al.  A Framework for Clustering Evolving Data Streams , 2003, VLDB.

[40]  Rolf Jagerman,et al.  Computing Web-scale Topic Models using an Asynchronous Parameter Server , 2017, SIGIR.

[41]  João Gama,et al.  A survey on concept drift adaptation , 2014, ACM Comput. Surv..

[42]  Alexander J. Smola,et al.  An architecture for parallel topic models , 2010, Proc. VLDB Endow..

[43]  Jignesh M. Patel,et al.  Storm@twitter , 2014, SIGMOD Conference.

[44]  Inder Monga,et al.  Lambda architecture for cost-effective batch and speed big data processing , 2015, 2015 IEEE International Conference on Big Data (Big Data).