Measuring Forecasting Accuracy: Problems and Recommendations (by the Example of SKU-Level Judgmental Adjustments)

Forecast adjustment commonly occurs when organizational forecasters adjust a statistical forecast of demand to take into account factors that are excluded from the statistical calculation. This paper addresses the question of how to measure the accuracy of such adjustments. We show that many existing error measures are generally not suited to the task because of specific features of the demand data. Alongside the well-known weaknesses of existing measures, we demonstrate a number of additional effects that complicate the interpretation of measurement results and can even lead to false conclusions. To ensure an interpretable and unambiguous evaluation, we recommend a metric based on aggregating performance ratios across time series using the weighted geometric mean. We illustrate that this measure treats over- and under-forecasting even-handedly, has a more symmetric distribution, and is robust.
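
The recommended aggregation can be illustrated with a minimal sketch. The abstract does not specify the performance ratio or the weights, so the code below makes two assumptions: each series' ratio is the mean absolute error (MAE) of the adjusted forecast divided by the MAE of the statistical forecast, and each series is weighted by its number of observations. The function names are hypothetical, and series with a zero MAE are simply skipped, which is a simplification rather than the paper's stated treatment.

```python
import math


def mean_absolute_error(actuals, forecasts):
    """Average absolute difference between actual and forecast values."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def weighted_geometric_mean_ratio(series):
    """Aggregate per-series MAE ratios with a weighted geometric mean.

    `series` is a list of (actuals, adjusted, statistical) tuples.
    Assumption: the weight of a series is its number of observations.
    A result below 1 means the adjusted forecasts were more accurate
    on average; a result above 1 means they were less accurate.
    """
    log_sum = 0.0
    weight_sum = 0
    for actuals, adjusted, statistical in series:
        mae_adjusted = mean_absolute_error(actuals, adjusted)
        mae_statistical = mean_absolute_error(actuals, statistical)
        if mae_adjusted == 0 or mae_statistical == 0:
            # Assumption: skip series whose ratio is zero or undefined.
            continue
        n = len(actuals)
        log_sum += n * math.log(mae_adjusted / mae_statistical)
        weight_sum += n
    if weight_sum == 0:
        raise ValueError("no series with a defined performance ratio")
    return math.exp(log_sum / weight_sum)


# Two equally long series: adjustment doubles the error in one
# (ratio 2) and halves it in the other (ratio 0.5).
example = [
    ([10, 10], [12, 12], [11, 11]),  # MAE 2 vs. MAE 1
    ([10, 10], [11, 11], [12, 12]),  # MAE 1 vs. MAE 2
]
overall = weighted_geometric_mean_ratio(example)
```

In this toy example a doubling and a halving of the error cancel out, so the geometric mean of the ratios is 1, whereas their arithmetic mean would be 1.25 and would suggest that the adjustments made forecasts worse. This is one way to read the claim that the measure treats gains and losses even-handedly.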
