Towards efficient Big Data and data analytics: A review

“Big Data” is data whose scale, distribution, diversity, and/or timeliness require the use of new technical architectures and analytics to enable insights that unlock new sources of business value. It requires new data architectures, analytic sandboxes, new tools, and new analytical methods, as well as the integration of multiple skills into the new role of the data scientist. Organizations are deriving business benefit from analyzing ever larger and more complex data sets that increasingly require real-time or near-real-time capabilities. Big Data can come in multiple forms, ranging from highly structured financial data to text files, multimedia files, and genetic mappings. High volume is a consistent characteristic of Big Data.
