Big Data Augmentation with Data Warehouse: A Survey

With rapid changes in the world's technology, the use of social media, computer networks, the Internet of Things, and cloud computing has grown steadily. Research experiments also generate huge amounts of data that must be collected, managed, and analyzed. Such data is known as "Big Data". Research analysts have observed that this growing volume of data contains both useful and useless entities. When extracting useful information, data warehouses have difficulty keeping up with the increasing amount of data generated. With this paradigm shift, big data analytics has emerged as a promising area of research that supports business intelligence in decision making. This paper provides a comprehensive survey of Big Data, Big Data problems, Big Data analytics, and the Big Data warehouse. It also explains how the need to augment big data with the data warehouse emerged from the perspective of decision making, and compares methods and research problems. Finally, it describes applications that support Big Data and the data warehouse, along with their challenges.
