Research of Access Optimization of Small Files on Basis of B + Tree on Hadoop

Hadoop, an open-source framework for reliable, scalable, distributed computing that is used to process and store extremely large data sets, was originally designed to store large files. When it handles massive numbers of small files, it wastes storage space on the DataNodes and increases memory consumption on the NameNode. To address these shortcomings, this paper proposes an optimized small-file access scheme for the Hadoop platform based on a B+ tree index. The scheme speeds up small-file location through the file index and thereby improves the access efficiency of small files. The effectiveness of the proposed scheme is validated experimentally.
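The abstract does not give the index layout, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes small files are packed into a larger container file and that a B+ tree maps each file name to a hypothetical `(container, offset, length)` record. The names `SmallFileIndex`, `Node`, and `ORDER`, and the in-memory `merged` buffer standing in for an HDFS container file, are all illustrative assumptions.

```python
from bisect import bisect_left, bisect_right

ORDER = 4  # assumed maximum number of keys per node, chosen small for illustration


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []  # used by internal nodes only
        self.values = []    # used by leaves only: (container, offset, length)
        self.next = None    # leaves are chained for in-order scans


class SmallFileIndex:
    """B+ tree keyed by file name (hypothetical layout, not the paper's)."""

    def __init__(self):
        self.root = Node(leaf=True)

    def _find_leaf(self, key):
        node = self.root
        while not node.leaf:
            # Keys equal to a separator live in the right-hand subtree.
            node = node.children[bisect_right(node.keys, key)]
        return node

    def lookup(self, name):
        leaf = self._find_leaf(name)
        i = bisect_left(leaf.keys, name)
        if i < len(leaf.keys) and leaf.keys[i] == name:
            return leaf.values[i]
        return None

    def insert(self, name, location):
        split = self._insert(self.root, name, location)
        if split is not None:
            # The root itself split, so the tree grows by one level.
            sep, right = split
            new_root = Node(leaf=False)
            new_root.keys = [sep]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node, key, value):
        """Insert into the subtree; return (separator, new_right_node) on split."""
        if node.leaf:
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value  # overwrite an existing entry
                return None
            node.keys.insert(i, key)
            node.values.insert(i, value)
            if len(node.keys) <= ORDER:
                return None
            mid = len(node.keys) // 2
            right = Node(leaf=True)
            right.keys, node.keys = node.keys[mid:], node.keys[:mid]
            right.values, node.values = node.values[mid:], node.values[:mid]
            right.next, node.next = node.next, right
            # Leaf split: the separator is copied up and stays in the leaf.
            return right.keys[0], right
        i = bisect_right(node.keys, key)
        split = self._insert(node.children[i], key, value)
        if split is None:
            return None
        sep, child = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, child)
        if len(node.keys) <= ORDER:
            return None
        mid = len(node.keys) // 2
        up = node.keys[mid]  # internal split: the middle key moves up
        right = Node(leaf=False)
        right.keys, node.keys = node.keys[mid + 1:], node.keys[:mid]
        right.children, node.children = node.children[mid + 1:], node.children[:mid + 1]
        return up, right


# Usage: pack 20 small payloads into one buffer and index them by name.
index = SmallFileIndex()
merged = b""
for i in range(20):
    payload = f"content-{i}".encode()
    index.insert(f"file-{i:03d}.txt", ("merged-0001", len(merged), len(payload)))
    merged += payload

container, offset, length = index.lookup("file-007.txt")
print(container, merged[offset:offset + length])
```

Because every lookup descends from the root to a single leaf, the cost of locating a small file grows logarithmically with the number of indexed files, and the leaf chain allows sorted scans over file names without revisiting internal nodes.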
