Landscape of Unified Big Data Platforms

Big data has become a hot topic in recent years (Mayer-Schonberger & Cukier, 2013). Wikipedia (Wikipedia, 2013) defines big data as a term “for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.” The sources of big data include e-commerce, the Internet of Things, scientific experiments, and Web applications such as user-generated content. Several Vs are commonly used to describe big data, including volume, velocity, variety, and veracity. Extracting valuable information for decision making means processing huge volumes of data of various types (structured, unstructured, and semi-structured) at different stages (data in motion, data at rest, and archived data). This raises several challenges, including capturing, cleaning, storing, transferring, searching, analyzing, and visualizing the data. These challenges have sparked the development of unified big data platforms. This chapter discusses the current competitive landscape of unified big data platforms.
