Density distribution sunflower plots display high-density bivariate data. They are useful when a conventional scatter plot is difficult to read because the plot symbols overstrike one another. The x-y plane is subdivided into a lattice of regular hexagonal bins of width w, which the user specifies. The user also specifies the values of l, d, and k, which affect the plot as follows. When a bin contains fewer than l observations, its observations are plotted individually, as in a conventional scatter plot. Each bin with from l to d observations contains a light sunflower. Bins with more than d observations contain a dark sunflower. In a light sunflower, each petal represents one observation. In a dark sunflower, each petal represents k observations. (A dark sunflower with p petals represents between pk - k/2 and pk + k/2 observations.) The user can control the sizes and colors of the sunflowers. By selecting appropriate colors and sizes for the light and dark sunflowers, the user can obtain plots that convey both the overall density distribution of the data and the number of data points in any given region. The use of this graphic is illustrated with data from the Framingham Heart Study. A documented Stata program, called sunflower, is available to draw these graphs. It can be downloaded from the Statistical Software Components archive at http://ideas.repec.org/c/boc/bocode/s430201.html . (Journal of Statistical Software 2003; 8 (3): 1-5. Posted at http://www.jstatsoft.org/index.php?vol=8 .)
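The per-bin rule described above can be sketched in a few lines of Python. This is an illustration only, not the code of the Stata sunflower program: the function name sunflower_for_bin is hypothetical, the hexagonal binning and the drawing are left out, and the petal count of a dark sunflower is assumed to be n/k rounded to the nearest integer (halves rounded up), which is one reading of the stated range pk - k/2 to pk + k/2.

```python
from typing import Tuple


def sunflower_for_bin(n: int, l: int, d: int, k: int) -> Tuple[str, int]:
    """Decide how a bin holding n observations would be drawn.

    Returns a (kind, count) pair:
      ("points", n) -> plot the n observations individually
      ("light", n)  -> light sunflower with n petals (1 observation each)
      ("dark", p)   -> dark sunflower with p petals (about k observations each)
    """
    if n < 0 or k < 1:
        raise ValueError("n must be non-negative and k must be at least 1")
    if n < l:
        # Fewer than l observations: conventional scatter plot symbols.
        return ("points", n)
    if n <= d:
        # From l to d observations: one petal per observation.
        return ("light", n)
    # More than d observations: assumed rounding of n / k to the nearest
    # integer, computed with integers only so that halves round up.
    petals = (2 * n + k) // (2 * k)
    return ("dark", petals)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for this example.
    for count in (2, 3, 12, 13, 18):
        print(count, sunflower_for_bin(count, l=3, d=12, k=5))
```

With the example values l = 3, d = 12, and k = 5, a bin of 13 observations maps to a dark sunflower with 3 petals, which agrees with the stated range because 13 lies between 3(5) - 5/2 = 12.5 and 3(5) + 5/2 = 17.5.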