This article describes how a mixture of two densities, f0 and f1, may be decomposed into a different mixture consisting of three densities. These new densities, f+, f-, and f=, summarize the differences between f0 and f1: f+ is high where f1 exceeds f0; f- is high where f1 falls short of f0; and f= represents what f1 and f0 have in common. The supports of f+ and f- are disjoint. This decomposition of the mixture of f0 and f1 is analogous to the set-theoretic decomposition of the union of two sets A and B into the disjoint sets A \ B, B \ A, and A ∩ B. Sample points from f0 and f1 can be assigned to one of these three densities, allowing the differences between f0 and f1 to be visualized in a single plot, which serves as a visual hypothesis test of whether f0 is equal to f1. We describe two closely related decompositions of this kind and contrast their behavior under the null hypothesis f0 = f1, giving some insight into how such plots may be interpreted. We present two examples of uses of these methods: visualization of departures from independence, and visualization of a two-class classification problem. Other potential applications are discussed.
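To make the idea concrete, the following sketch shows one construction that has the properties listed above. It is an assumption for illustration and is not necessarily either of the two decompositions the article describes: it takes f+ proportional to max(f1 - f0, 0), f- proportional to max(f0 - f1, 0), and f= proportional to min(f0, f1), and applies them to the equal-weight mixture (f0 + f1) / 2. The function names, the grid, and the two normal densities are hypothetical choices.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated on the array x."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z * z) / (sd * np.sqrt(2.0 * np.pi))


def decompose(f0, f1, dx):
    """Split the equal-weight mixture (f0 + f1) / 2, given on a regular grid
    with spacing dx, into weighted components f+, f-, and f=.

    Assumed construction (not taken from the article):
      f+ is proportional to max(f1 - f0, 0),
      f- is proportional to max(f0 - f1, 0),
      f= is proportional to min(f0, f1).
    """
    parts = {
        "plus": 0.5 * np.maximum(f1 - f0, 0.0),
        "minus": 0.5 * np.maximum(f0 - f1, 0.0),
        "equal": np.minimum(f0, f1),
    }
    weights, densities = {}, {}
    for name, part in parts.items():
        mass = part.sum() * dx  # Riemann-sum approximation of the integral
        weights[name] = mass
        # Normalize to a density; leave zeros if the component has no mass
        densities[name] = part / mass if mass > 0 else part
    return weights, densities


# Hypothetical example: f0 is N(0, 1) and f1 is N(1, 1.5^2)
x = np.linspace(-10.0, 12.0, 22001)
dx = x[1] - x[0]
f0 = normal_pdf(x, 0.0, 1.0)
f1 = normal_pdf(x, 1.0, 1.5)

weights, densities = decompose(f0, f1, dx)

# Recombining the three weighted components gives back the original mixture
mixture = sum(weights[k] * densities[k] for k in weights)
```

Under this construction the weights of f+ and f- are equal, each being half the total variation distance between f0 and f1, so when f0 = f1 all of the mass falls in f=.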