Big Data Analytics Technologies and Platforms: A Brief Review

A plethora of Big Data Analytics technologies and platforms have been proposed in recent years. However, as of 2017, only 53% of companies were adopting such tools. It seems that either the industry is not convinced by the promises of Big Data, or choosing the right technology or platform requires in-depth knowledge of the capabilities of all these tools. Before choosing a technology or platform, organizations have to investigate the needs of their applications and algorithms, as well as the advantages and drawbacks of each candidate. In this paper, we aim to help organizations select the technologies and platforms best suited to their analytic processes by offering a short review organized by categories of Big Data problems: processing (streaming and batch), storage, data integration, analytics, data governance, and monitoring.
