Comprehensive Survey on Hadoop Security

Emerging technologies generate data in very large volumes, and storing that data securely is of prime importance. Hadoop is widely used to store big data, but it does not assure the security of what it stores. This paper surveys approaches that help provide secure storage of files in Hadoop. The Hadoop framework was developed to support the processing and storage of big data in a distributed computing environment. Big data has become a key factor for companies because it can help them increase their operating margin. It also contains user-sensitive information, which raises many privacy issues. Big data refers to larger and more complex datasets obtained from a variety of network resources. These datasets are beyond the ability of traditional data processing software to capture, manage, and process within a given time frame. Many organizations use these massive volumes of data to tackle problems that could not be solved before. Because the data holds a great deal of valuable information, it must be processed in a short span of time so that companies can scale up and generate more revenue. Traditional system resources are not sufficient for this processing and storage, and this is where Hadoop comes into the picture. The main objective of Hadoop is to run big data applications. Although Hadoop is a great tool for data processing, it was initially designed for internal use (i.e., within a local cluster) without any organizational security perimeter, so Hadoop clusters were easily hacked and exposed to threats.
