A fuzzy neighborhood rough set method for anomaly detection in large scale data

Mining Outlier in database is to find exceptional objects that deviate from the rest of the datasets. Besides classical outlier analysis algorithms, recent studies have focused on mining local outliers. The outliers that have density distribution significantly different from their neighborhood.  However, the existing outlier detection algorithms suffer the drawbacks that they are inefficient in dealing with large scale datasets. In this paper, we propose a novel approach for outlier detection with voluminous data. This approach involves a neighborhood fuzzy rough set theory to rank outlier according to fuzzy membership function computed in rough approximation space. In order to improve the speed of computation, an efficient parallel computing system based on Map Reduce model is developed