large synthetic data sets to compare different data mining methods