MapReduce-based fuzzy very fast decision tree for constructing prediction intervals

We propose a fuzzy version of the very fast decision tree (VFDT) for constructing prediction intervals and compare the resulting intervals with those generated by the traditional VFDT. The proposed fuzzy VFDT captures the intrinsic features of VFDT as well as the uncertainty present in the data. Both VFDT and fuzzy VFDT were trained using the lower upper bound estimation (LUBE) method in order to generate prediction intervals. We implemented both VFDT and the newly developed fuzzy VFDT on the Apache Hadoop MapReduce framework, where multiple slave nodes build the VFDT and fuzzy VFDT models. The developed models were tested on six datasets taken from the web. We conducted a sensitivity analysis by studying the influence of the window size of the data stream and the number of bins used in discretisation on the final results. The results demonstrate that the proposed MapReduce-based fuzzy VFDT and VFDT can construct high-quality prediction intervals precisely and quickly.
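To make the notion of interval quality concrete, the following minimal sketch scores a set of prediction intervals by two measures commonly reported in the LUBE literature: the fraction of targets covered by their interval, and the mean interval width normalised by the target range. The function names `coverage_probability` and `normalised_mean_width` and the sample values are hypothetical and chosen for illustration only; the abstract does not state which quality measures were used in this work.

```python
from typing import Sequence


def coverage_probability(y: Sequence[float],
                         lower: Sequence[float],
                         upper: Sequence[float]) -> float:
    """Fraction of targets that fall inside their [lower, upper] interval."""
    if not (len(y) == len(lower) == len(upper)) or len(y) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(1 for t, lo, hi in zip(y, lower, upper) if lo <= t <= hi)
    return covered / len(y)


def normalised_mean_width(y: Sequence[float],
                          lower: Sequence[float],
                          upper: Sequence[float]) -> float:
    """Mean interval width divided by the range of the targets."""
    if not (len(y) == len(lower) == len(upper)) or len(y) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    target_range = max(y) - min(y)
    if target_range == 0:
        raise ValueError("target range is zero; width cannot be normalised")
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(y)
    return mean_width / target_range


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper's datasets).
    targets = [1.0, 2.0, 3.0, 4.0]
    lower_bounds = [0.0, 1.5, 3.5, 3.0]
    upper_bounds = [2.0, 2.5, 4.5, 5.0]
    print(coverage_probability(targets, lower_bounds, upper_bounds))
    print(normalised_mean_width(targets, lower_bounds, upper_bounds))
```

Under these assumptions, a high-quality set of intervals is one that combines high coverage with a small normalised width; the two measures trade off against each other, which is why they are usually reported together.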