Theory of confusion

Given a classifier, we presently use a confusion matrix to quantify how much the classifier deviates from truth on training data. This limited application of the confusion matrix has two shortcomings: (1) it does not communicate trends in feature space, for example where errors congregate, and (2) the truth mapping is largely unknown outside a small, potentially biased sample set. In practice, one does not have truth and must instead rely on an expert's opinion. We propose a mathematical theory of confusion that compares and contrasts the opinions of two experts (i.e., two classifiers). This theory has an advantage over traditional confusion matrices in that it can express classification confidence over all of feature space, not just at sampled truth. It quantifies different types of confusion between classifiers and yields the region of feature space where confusion occurs. An example using artificial neural networks is given.
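
To make the central idea concrete, the sketch below illustrates one way the "confusion region" of two experts could be computed: train two small neural-network classifiers on the same data and sweep a bounded window of feature space, marking the points where their predictions disagree. This is a minimal illustration of the concept, not the paper's formal construction; the dataset, network sizes, grid bounds, and resolution are assumptions chosen for the example.

import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Illustrative training data: a 2-D, two-class problem.
X, y = make_moons(n_samples=200, noise=0.25, random_state=0)

# Two "experts": classifiers of the same family with different
# capacities and initializations (both choices are assumptions).
expert_a = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                         random_state=1).fit(X, y)
expert_b = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                         random_state=2).fit(X, y)

# Evaluate over a dense grid covering the feature space of interest,
# not just at the sampled training points.
xx, yy = np.meshgrid(np.linspace(-2, 3, 300), np.linspace(-1.5, 2, 300))
grid = np.c_[xx.ravel(), yy.ravel()]

# Confusion region: the set of grid points where the experts disagree.
disagree = expert_a.predict(grid) != expert_b.predict(grid)
print(f"Experts disagree on {disagree.mean():.1%} of the evaluated region")

Plotting the boolean `disagree` mask over the grid would visualize where in feature space the two experts confuse one another, which is the kind of spatial information a conventional confusion matrix cannot convey.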