Geodesic Distance on Gaussian Manifolds to Reduce the Statistical Errors in the Investigation of Complex Systems

In recent years the reputation of medical, economic, and scientific expertise has been seriously damaged by a series of false predictions and contradictory studies. The lax application of statistical principles has certainly contributed to this uncertainty and to the loss of confidence in the sciences. Various assumptions generally held to be valid in statistical treatments have shown their limits. In particular, it has been clear for some time that even slight departures from normality and homoscedasticity can significantly affect classical significance tests. Robust statistical methods have been developed that provide much more reliable estimates. They do not, however, address an additional problem typical of the natural sciences, whose data are often the output of delicate measurements. The data may therefore not only be sampled from a non-normal pdf but also be affected by significant levels of additive Gaussian noise of varying amplitude. To tackle this additional source of uncertainty, this paper shows how existing robust statistical tools can be usefully complemented with the Geodesic Distance on Gaussian Manifolds. In handling noise with a Gaussian distribution, this metric is conceptually more appropriate and practically more effective than the traditional Euclidean distance. The results of a series of systematic numerical tests show the advantages of the proposed approach in all the main aspects of statistical inference, from measures of location and scale to effect sizes and hypothesis testing. Particularly relevant is the reduction of up to 35% in Type II errors, which demonstrates the substantial improvement in power obtained by applying the proposed methods. It is worth emphasizing that the proposed approach provides a general framework in which noise with other statistical distributions can also be handled.
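
To make the contrast with the Euclidean distance concrete, the following is a minimal sketch of the geodesic (Fisher-Rao) distance between two univariate Gaussians, using the standard closed form for that manifold. The abstract does not state which expression or parametrization the paper adopts, so the formula choice is an assumption, and the function name `geodesic_distance` and its arguments are hypothetical. Each measurement is treated as a Gaussian with mean `mu` (the measured value) and standard deviation `sigma` (its error bar).

```python
import math


def geodesic_distance(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Assumed closed form (standard for univariate Gaussians):
        delta = sqrt(((mu1 - mu2)^2 + 2 (sigma1 - sigma2)^2)
                     / ((mu1 - mu2)^2 + 2 (sigma1 + sigma2)^2))
        GD    = sqrt(2) * ln((1 + delta) / (1 - delta)) = 2 sqrt(2) * atanh(delta)
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be strictly positive")
    dmu_sq = (mu1 - mu2) ** 2
    numerator = dmu_sq + 2.0 * (sigma1 - sigma2) ** 2
    denominator = dmu_sq + 2.0 * (sigma1 + sigma2) ** 2
    delta = math.sqrt(numerator / denominator)
    return 2.0 * math.sqrt(2.0) * math.atanh(delta)


# Two pairs of points with the same Euclidean separation |mu1 - mu2| = 1
# but different noise levels (hypothetical values for illustration).
low_noise = geodesic_distance(0.0, 1.0, 1.0, 1.0)
high_noise = geodesic_distance(0.0, 5.0, 1.0, 5.0)
print(low_noise > high_noise)
```

In this sketch the same difference in measured values counts for less when the error bars are larger, which the Euclidean distance between the means cannot express.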
