Big Data and Its Analyzing Tools: A Perspective

Data are generated and stored in databases at very high speed and therefore need to be handled and analyzed properly. Industry now makes extensive use of Hadoop and Spark to analyze large datasets. Both frameworks are used to speed up the processing of huge, complex datasets, and many researchers are comparing the two. The major questions that arise are: Is Spark a substitute for Hadoop? Will Hadoop be replaced by Spark in the near future? Spark can run on top of Hadoop and extends the MapReduce model to support more types of computation, including stream processing and interactive queries. Spark generally executes much faster than Hadoop MapReduce, but in terms of fault tolerance, Hadoop is slightly more fault tolerant than Spark. This survey paper compares various big data analytics tools and discusses Hadoop and Spark in detail. It further gives an overview of big data and of the issues affecting Spark and Hadoop, and discusses approaches to resolving those issues.
