A Causal Approach for Mining Interesting Anomalies

We propose a novel approach that combines Bayesian networks and probabilistic association rules to discover and explain anomalies in data. The Bayesian network organizes information so as to capture both correlation and causality in the feature space, while the probabilistic association rules have a structure similar to that of association mining rules. In particular, we focus on two types of rules: (i) low support & high confidence and (ii) high support & low confidence. New data points that satisfy either of the two rule types, conditioned on the Bayesian network, are the candidate anomalies. We perform extensive experiments on well-known benchmark data sets and demonstrate that our approach identifies anomalies with high precision and recall. Moreover, our approach can be used to discover contextual information from the mined anomalies, which other techniques often fail to do.
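
To make the two rule types concrete, the following minimal sketch computes the support and confidence of a rule X -> Y on a toy data set and labels it accordingly. It assumes the classic association-rule definitions, support = P(X, Y) and confidence = P(Y | X), which may differ from the probabilistic definitions used in our approach. It does not include the Bayesian network conditioning. The attribute names, the helper functions, and all thresholds are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, str]


def matches(record: Record, pattern: Record) -> bool:
    """Return True if the record agrees with every attribute=value pair in pattern."""
    return all(record.get(key) == value for key, value in pattern.items())


def support_confidence(
    data: List[Record], antecedent: Record, consequent: Record
) -> Tuple[float, float]:
    """Classic (assumed) definitions: support = P(X, Y), confidence = P(Y | X)."""
    n_total = len(data)
    n_ante = sum(matches(r, antecedent) for r in data)
    n_both = sum(matches(r, antecedent) and matches(r, consequent) for r in data)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


def rule_type(
    support: float,
    confidence: float,
    low_sup: float = 0.1,    # hypothetical threshold
    high_sup: float = 0.3,   # hypothetical threshold
    low_conf: float = 0.5,   # hypothetical threshold
    high_conf: float = 0.9,  # hypothetical threshold
) -> Optional[str]:
    """Label a rule as one of the two types of interest, or None otherwise."""
    if support <= low_sup and confidence >= high_conf:
        return "low support & high confidence"
    if support >= high_sup and confidence <= low_conf:
        return "high support & low confidence"
    return None


# Toy data (made up for illustration): 10 records over two attributes.
data: List[Record] = (
    [{"weather": "sunny", "traffic": "heavy"}] * 4
    + [{"weather": "sunny", "traffic": "light"}] * 4
    + [{"weather": "snow", "traffic": "heavy"}]
    + [{"weather": "rain", "traffic": "light"}]
)

for ante, cons in [
    ({"weather": "snow"}, {"traffic": "heavy"}),
    ({"weather": "sunny"}, {"traffic": "heavy"}),
]:
    sup, conf = support_confidence(data, ante, cons)
    print(ante, "->", cons, sup, conf, rule_type(sup, conf))
```

In this toy setting, the rule with a rare antecedent and a near-certain consequent falls into type (i), while the rule with a frequent antecedent and an uncertain consequent falls into type (ii). In the full approach, these quantities would be evaluated conditioned on the Bayesian network rather than on raw counts.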