Scalable bootstrap attribute reduction for massive data

Attribute reduction is one of the fundamental techniques for knowledge acquisition in rough set theory. Traditional attribute reduction algorithms must load the whole dataset into memory at once, which is infeasible for massive decision tables because of hardware limitations. To solve this problem, we propose the bag of little bootstraps attribute reduction algorithm (BLBAR), which combines the bag of little bootstraps with attribute discernibility. Specifically, the algorithm first samples from the original decision table to generate a number of decision sub-tables, then finds the reducts of the bootstrap samples of each sub-table through attribute discernibility, and finally integrates all of these reducts into the reduct of the original massive decision table. Experimental results demonstrate that BLBAR improves the feasibility, scalability, and efficiency of attribute reduction on massive decision tables.
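The three steps described above (sub-table sampling, per-bootstrap reduct search, integration) can be illustrated with a minimal Python sketch. It is not the BLBAR implementation: the function names `greedy_reduct` and `blb_reduct`, their parameters, the greedy discernibility heuristic, the choice to resample each sub-table to its own size, and the use of a plain union to integrate the reducts are all assumptions made for illustration. Each row is assumed to be a tuple whose last element is the decision value.

```python
import random
from itertools import combinations


def greedy_reduct(rows):
    """Greedy attribute selection over discernibility sets.

    Hypothetical helper: it returns a set of condition-attribute indices
    that discerns every discernible pair of rows with different decisions.
    The greedy choice is a heuristic and may not give a minimal reduct.
    """
    # For each pair of distinct rows with different decisions, record the
    # condition attributes on which the two rows differ.
    pairs = []
    for a, b in combinations(set(rows), 2):
        if a[-1] != b[-1]:
            diff = frozenset(i for i in range(len(a) - 1) if a[i] != b[i])
            if diff:  # inconsistent pairs cannot be discerned; skip them
                pairs.append(diff)

    chosen = set()
    while pairs:
        # Count how many remaining pairs each attribute discerns.
        counts = {}
        for diff in pairs:
            for i in diff:
                counts[i] = counts.get(i, 0) + 1
        # Pick the attribute covering the most pairs (lowest index on ties).
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        pairs = [d for d in pairs if best not in d]
    return chosen


def blb_reduct(table, n_subtables, subtable_size, n_bootstraps, seed=0):
    """Sample sub-tables, reduce each bootstrap sample, merge the results."""
    rng = random.Random(seed)
    reducts = []
    for _ in range(n_subtables):
        # Step 1: draw a small sub-table without replacement.
        sub = rng.sample(table, subtable_size)
        for _ in range(n_bootstraps):
            # Step 2: bootstrap the sub-table (with replacement) and reduce.
            # Assumption: resample to the sub-table's size for simplicity.
            boot = rng.choices(sub, k=len(sub))
            reducts.append(greedy_reduct(boot))
    # Step 3 (assumption): integrate the reducts by taking their union.
    return set().union(*reducts)
```

Only one sub-table needs to be held in memory at a time in this sketch, which is the property that motivates the sampling scheme. On a toy table such as `[(i % 2, i % 3, i % 2) for i in range(60)]`, where the first attribute alone determines the decision, `greedy_reduct` applied to the full table returns `{0}`.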