New Categorical Metrics for Air Quality Model Evaluation

Abstract: Traditional categorical metrics used in model evaluations are “clear cut” measures: the model’s ability to predict an “exceedance” is defined by a fixed threshold concentration, and the metrics are computed from observation–forecast sets that are paired in both space and time. These metrics are informative but limited for evaluating air quality forecast (AQF) systems, because AQF generally examines exceedances on a regional scale rather than at a single monitor. Three new categorical metrics are developed: the weighted success index (WSI), area hit (aH), and area false-alarm ratio (aFAR). In the calculation of WSI, credits are given to the observation–forecast pairs within the observed exceedance region (missed forecast) or the forecast exceedance region (false alarm), depending on the distance of the points from the central line (perfect observation–forecast match line or 1:1 line on scatterplot). The aH and aFAR are defined by matching observed and forecast exceedances within an area ...
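
For orientation, the traditional "clear cut" approach that the abstract contrasts against can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `categorical_metrics`, the strict `>` test for an exceedance, and the threshold value used in the example are all assumptions made here. The hit rate, false-alarm ratio, and critical success index follow their standard contingency-table definitions. The new metrics (WSI, aH, aFAR) are not sketched, because the abstract does not give their formulas.

```python
import math


def categorical_metrics(obs, fcst, threshold):
    """Traditional categorical metrics from observation-forecast pairs.

    Each pair is assumed to be matched in both space and time.
    An "exceedance" is taken to be a value strictly above `threshold`
    (an assumption of this sketch; some conventions use >=).
    """
    if len(obs) != len(fcst):
        raise ValueError("obs and fcst must be paired one-to-one")

    hits = misses = false_alarms = 0
    for o, f in zip(obs, fcst):
        obs_exc = o > threshold
        fcst_exc = f > threshold
        if obs_exc and fcst_exc:
            hits += 1            # exceedance observed and forecast
        elif obs_exc:
            misses += 1          # exceedance observed but not forecast
        elif fcst_exc:
            false_alarms += 1    # exceedance forecast but not observed

    def ratio(num, den):
        # Undefined when there are no relevant cases; report NaN.
        return num / den if den else math.nan

    return {
        "hits": hits,
        "misses": misses,
        "false_alarms": false_alarms,
        # Fraction of observed exceedances that were forecast.
        "hit_rate": ratio(hits, hits + misses),
        # Fraction of forecast exceedances that did not occur.
        "far": ratio(false_alarms, hits + false_alarms),
        # Critical success index: hits over all exceedance cases.
        "csi": ratio(hits, hits + misses + false_alarms),
    }


# Illustrative values only (hypothetical concentrations and threshold).
example = categorical_metrics(
    obs=[90.0, 60.0, 88.0, 50.0, 95.0],
    fcst=[92.0, 87.0, 70.0, 40.0, 99.0],
    threshold=85.0,
)
```

Under this scheme a pair such as an observation of 85.1 with a forecast of 84.9 (against a hypothetical threshold of 85.0) is scored as a full miss, even though the two values nearly agree. The weighted and area-based metrics introduced in the abstract are aimed at this kind of rigidity.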