Statistical uncertainty associated with histograms in the Earth sciences

Two types of quantitative information can be distinguished in the Earth sciences: categorical data (e.g., mineral type, fossil name) and continuous data (e.g., apparent age, strike, dip). Many branches of the Earth sciences study populations of such data by collecting a random sample and binning it into a histogram. Histograms of categorical data follow multinomial distributions. Because the outcomes of a multinomial distribution with M categories are subject to a constant-sum constraint, they must all plot on an (M − 1)-simplex Δ_{M−1}. Confidence regions for such multinomial distributions can be computed using Bayesian statistics. The conjugate prior/posterior to the multinomial distribution is the Dirichlet distribution. A 100(1 − α)% confidence region for the unknown multinomial population, given an observed sample histogram, is a polygon on Δ_{M−1} containing 100(1 − α)% of its Dirichlet posterior. Projecting this polygon onto the sides of the simplex yields M confidence intervals for the M bin counts. These confidence intervals are "simultaneous" in the sense that they form a band completely containing the 100(1 − α)% most likely multinomial populations. Unlike the bins of a categorical histogram, adjacent bins of a histogram of continuous data are not mutually independent. If this "smoothness" of the unknown population is not taken into account, the Bayesian confidence bands described above will be overly conservative. This problem can be solved by introducing an ad hoc prior of "smoothing weights" w = exp(−sr), where r is the integrated squared second derivative of the histogram and s is a "smoothing parameter."
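The procedure summarized above can be illustrated with a small Monte Carlo sketch. It is not the paper's algorithm, only one plausible way to approximate the band under stated assumptions: the prior is a symmetric Dirichlet with a user-chosen pseudo-count (`prior`, an assumed parameter), the 100(1 − α)% region is approximated by the highest-density posterior draws rather than by an exact polygon, and r is replaced by a sum of squared second differences of the bin proportions, whose scaling is arbitrary. All function names are hypothetical.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one point on the simplex from Dirichlet(alpha) by normalizing gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]


def log_dirichlet_kernel(p, alpha):
    """Unnormalized log density of Dirichlet(alpha) at p (floored to avoid log(0))."""
    return sum((a - 1.0) * math.log(max(x, 1e-300)) for x, a in zip(p, alpha))


def roughness(p):
    """Sum of squared second differences: an assumed discrete stand-in for r."""
    return sum((p[i - 1] - 2.0 * p[i] + p[i + 1]) ** 2 for i in range(1, len(p) - 1))


def simultaneous_band(counts, level=0.95, prior=1.0, s=0.0, n_draws=4000, seed=1):
    """Approximate simultaneous band for the bin proportions.

    s = 0 gives the plain Dirichlet posterior (categorical case); s > 0 applies
    smoothing weights w = exp(-s * r) as importance weights (continuous case).
    """
    rng = random.Random(seed)
    alpha = [c + prior for c in counts]  # conjugate update: posterior is Dirichlet(counts + prior)
    draws = [sample_dirichlet(alpha, rng) for _ in range(n_draws)]
    log_w = [-s * roughness(p) for p in draws]
    shift = max(log_w)  # subtract the maximum so exp() cannot underflow to all zeros
    weights = [math.exp(lw - shift) for lw in log_w]
    # Rank draws by (smoothed) posterior density, highest first.
    scored = sorted(
        zip((log_dirichlet_kernel(p, alpha) + lw for p, lw in zip(draws, log_w)),
            weights, draws),
        key=lambda t: t[0],
        reverse=True,
    )
    total_w = sum(weights)
    kept, mass = [], 0.0
    for _, w, p in scored:  # keep the most likely draws until they hold `level` of the mass
        kept.append(p)
        mass += w / total_w
        if mass >= level:
            break
    m = len(counts)
    lower = [min(p[i] for p in kept) for i in range(m)]  # project the kept region onto each bin
    upper = [max(p[i] for p in kept) for i in range(m)]
    return lower, upper


if __name__ == "__main__":
    counts = [12, 30, 45, 25, 8]  # made-up histogram, for illustration only
    for name, s in (("no smoothing", 0.0), ("smoothing s=50", 50.0)):
        lo, hi = simultaneous_band(counts, s=s)
        print(name, [(round(a, 3), round(b, 3)) for a, b in zip(lo, hi)])
```

The band is returned as proportions; multiplying by the sample size converts it to expected bin counts. Taking the envelope of the retained draws mimics the projection of the confidence region onto each axis of the simplex, which is why the resulting intervals are simultaneous rather than bin-by-bin.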
