Discovering Unknown Unknowns of Predictive Models

Predictive models are widely used in domains ranging from the judiciary and healthcare to autonomous driving. As we increasingly rely on these models for high-stakes decisions, identifying and characterizing their unexpected failures in the real world is critical. We categorize the errors of a predictive model into two classes: known unknowns and unknown unknowns [3]. Known unknowns are those data points for which the model makes low confidence predictions and errs, whereas unknown unknowns correspond to those points where the model is highly confident about its predictions but is actually wrong. Since the model lacks awareness of such unknown unknowns, approaches developed for addressing known unknowns (e.g., active learning) cannot be used for discovering unknown unknowns.

Unknown unknowns primarily occur when the data used for training a predictive model is not representative of the samples encountered at test time, i.e., when the model is deployed in the wild. This mismatch could be a result of biases in the collection of training data, or of differences between the train and test distributions due to temporal, spatial, or other factors such as a subtle shift in the task definition.
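To make the distinction concrete, here is a minimal illustrative sketch (not from the paper): given a model's predictions, its reported confidences, and the true labels, misclassified points are split by a hypothetical confidence threshold `tau` into known unknowns (uncertain and wrong) and unknown unknowns (confident and wrong). The function name, threshold value, and toy data are all assumptions for illustration.

```python
def partition_errors(predictions, confidences, labels, tau=0.9):
    """Split misclassified points into known and unknown unknowns.

    A point is an unknown unknown if the model erred while reporting
    confidence at or above the (illustrative) threshold tau; it is a
    known unknown if the model erred with confidence below tau.
    """
    known_unknowns, unknown_unknowns = [], []
    for i, (pred, conf, label) in enumerate(zip(predictions, confidences, labels)):
        if pred != label:                    # the model erred on point i
            if conf >= tau:
                unknown_unknowns.append(i)   # confident but wrong
            else:
                known_unknowns.append(i)     # uncertain and wrong
    return known_unknowns, unknown_unknowns

# Toy example: points 1, 2, and 3 are misclassified.
preds = [1, 0, 1, 1, 0]
confs = [0.95, 0.55, 0.97, 0.60, 0.99]
truth = [1, 1, 0, 0, 0]
ku, uu = partition_errors(preds, confs, truth, tau=0.9)
# ku -> [1, 3] (errors with confidence below 0.9)
# uu -> [2]    (error with confidence 0.97, above 0.9)
```

Note that only the unknown unknowns are invisible to confidence-based triage: a deployed system could flag the low-confidence mistakes for review, but point 2 would pass silently.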