A goodness-of-fit statistical toolkit

Statistical methods play a significant role throughout the life cycle of physics experiments and are an essential component of physics analysis. This ongoing project aims to develop an object-oriented software Toolkit for statistical data analysis. The Toolkit contains a variety of Goodness-of-Fit (GoF) tests, ranging from Chi-squared and Kolmogorov-Smirnov to lesser-known but generally much more powerful tests such as Anderson-Darling, Goodman, Fisz-Cramér-von Mises, and Kuiper. Thanks to its component-based design and its use of standard abstract interfaces for data analysis, the Toolkit can be used by other data analysis systems or integrated into experimental software frameworks. In this paper we describe the statistical details of the algorithms and the computational features of the Toolkit. To show that the code is consistent with the mathematical properties of the algorithms, we also present the results obtained by using the Toolkit to reproduce a couple of Goodness-of-Fit testing examples that are well known in the statistics literature.
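
As an illustration of the kind of distance the unbinned tests named above rely on, the following minimal sketch computes the two-sample Kolmogorov-Smirnov and Kuiper statistics from empirical distribution functions. It is not the Toolkit's code or interface: the function names `ecdf_differences`, `ks_statistic`, and `kuiper_statistic` are hypothetical, Python is chosen only for brevity, and the conversion of a statistic into a p-value is omitted.

```python
from bisect import bisect_right


def ecdf_differences(sample_a, sample_b):
    """Return F_a(x) - F_b(x) at every distinct pooled observation x.

    F_a and F_b are the empirical distribution functions of the two
    samples (hypothetical helper, for illustration only).
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    diffs = []
    for x in sorted(set(a) | set(b)):
        f_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        diffs.append(f_a - f_b)
    return diffs


def ks_statistic(sample_a, sample_b):
    """Kolmogorov-Smirnov distance: largest absolute EDF difference."""
    return max(abs(d) for d in ecdf_differences(sample_a, sample_b))


def kuiper_statistic(sample_a, sample_b):
    """Kuiper distance: largest positive plus largest negative deviation."""
    diffs = ecdf_differences(sample_a, sample_b)
    # Both EDFs are 0 below the smallest observation, so each one-sided
    # deviation is bounded below by 0.
    d_plus = max(max(diffs), 0.0)
    d_minus = max(max(-d for d in diffs), 0.0)
    return d_plus + d_minus


if __name__ == "__main__":
    first = [1, 4]
    second = [2, 3]
    print(ks_statistic(first, second))      # 0.5
    print(kuiper_statistic(first, second))  # 1.0
```

In this small example the two statistics differ because the Kuiper distance adds the deviations in both directions, whereas the Kolmogorov-Smirnov distance keeps only the larger of the two.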
