Detecting Price Outliers in European Trade Data with the Forward Search

We describe empirical work in the domain of clustering and outlier detection, for the analysis of European trade data. It is our first attempt to evaluate benefits and limitations of the forward search approach for regression and multivariate analysis Atkinson and Riani (Robust diagnostic regression analysis, Springer, 2000), Atkinson et al. (Exploring multivariate data with the forward search, Springer, 2004), within a concrete application scenario and in relation to a comparable backward method developed in the JRC by Arsenis et al. (Price outliers in eu external trade data, Enlargement and Integration Workshop 2005, 2005). Our findings suggest that the automatic clustering based on Mahalanobis distances may be inappropriate in presence of a high-density area in the dataset. Follow up work is discussed extensively in Riani et al. (Fitting mixtures of regression lines with the forward search, Mining massive data sets for security, IOS, 2008).