Big Data Optimization Techniques: A Survey

As the world becomes digitized, data overflows from many sources in many formats at a speed that traditional systems cannot compute and analyze. Big data tools such as Hadoop, an open-source software framework that stores and processes data in a distributed environment, are used instead. In the last few years, developing Big Data applications has become increasingly important, and many organizations now depend on knowledge extracted from huge amounts of data. However, traditional data techniques show reduced performance and accuracy, slow responsiveness, and a lack of scalability. A great deal of work has been carried out to solve these complicated Big Data problems, and as a result various types of technologies have been developed. This research work is a survey of recent optimization technologies and their applications developed for Big Data. It aims to help readers choose the right combination of Big Data technologies according to their requirements.
