Improving the Storage Efficiency of Small Files in Cloud Storage

An approach based on SequenceFile is proposed to improve the storage efficiency of small files in cloud storage systems built on the Hadoop Distributed File System (HDFS). The approach applies multi-attribute decision theory to indices such as reading time, combining time, and saved memory size to obtain an optimal file-merging scheme, so that a balance between computing time and memory space is achieved. A system load forecasting algorithm based on the analytic hierarchy process is designed to predict the load of the system. SequenceFile is used to combine the small files. Experimental results show that the storage efficiency of small files is improved without degrading the performance of the storage system.