Comparing Standard Regression Modeling to Ensemble Modeling: How Data Mining Software Can Improve Economists’ Predictions

Economists’ wariness of data mining may be misplaced, even in cases where economic theory provides a well-specified model for estimation. We discuss how new data mining/ensemble modeling software, for example the program TreeNet, can be used to create predictive models. We then show how for a standard labor economics problem, the estimation of wage equations, TreeNet outperforms standard OLS regression in terms of lower prediction error. Ensemble modeling resists the tendency to overfit data. We conclude by considering additional types of economic problems that are well-suited to use of data mining techniques.