Big Data Open Source Platforms

In a global market, the capacity to mine and analyze user data lets companies respond to the needs of their users quickly and accurately. Big Data platforms are one way for companies to meet the challenges involved in building this capacity. Unfortunately, the number of challenges to be addressed, combined with the many different solutions proposed, has produced a large number of platforms, making it hard to name a single definitive platform that suits every company. In this paper we compare six of the most important open source Big Data platforms to help companies and organizations choose the one best suited to their needs. We analyze the following open source platforms: Apache Mahout, MOA, R Project, Vowpal Wabbit, PEGASUS, and GraphLab Create™.
