Testing Randomness Online

The hypothesis of randomness is fundamental in statistical machine learning and in many areas of nonparametric statistics; it says that the observations are independent and come from the same unknown probability distribution. This hypothesis is close, in certain respects, to the hypothesis of exchangeability, which postulates that the distribution of the observations is invariant with respect to their permutations. This paper reviews known methods of testing the two hypotheses, concentrating on the online mode of testing, in which the observations arrive sequentially. All known online methods for testing these hypotheses are based on conformal martingales, which are defined and studied in detail. The paper emphasizes conceptual and practical aspects and states two kinds of results. Validity results limit the probability of a false alarm or the frequency of false alarms for various procedures based on conformal martingales, including conformal versions of the CUSUM and Shiryaev–Roberts procedures. Efficiency results establish connections between randomness, exchangeability, and conformal martingales. The version of this paper at http://alrw.net (Working Paper 24) is updated most often.
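
To make the idea of a conformal martingale concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not the construction used in the paper. It assumes scalar observations whose nonconformity score is the observation itself, smoothed conformal p-values with random tie-breaking, and a power betting function f(p) = ε·p^(ε−1) with a fixed ε. The function names and the value ε = 0.5 are hypothetical choices. Under exchangeability the smoothed p-values are independent and uniform on [0, 1], so the running product S_n is a nonnegative martingale starting from 1, and Ville's inequality bounds the probability that it ever reaches a threshold c by 1/c.

```python
import math
import random


def conformal_p_values(scores, rng):
    """Yield one smoothed conformal p-value per score, in arrival order.

    Assumption: larger scores mean "stranger" observations.
    """
    seen = []
    for a in scores:
        seen.append(a)
        greater = sum(1 for b in seen if b > a)
        equal = sum(1 for b in seen if b == a)  # counts the new score itself
        theta = rng.random()  # random tie-breaking in [0, 1)
        yield (greater + theta * equal) / len(seen)


def power_martingale(p_values, epsilon=0.5):
    """Return the path of log S_n, where S_n = prod_i epsilon * p_i**(epsilon - 1)."""
    log_s = 0.0
    path = []
    for p in p_values:
        p = max(p, 1e-12)  # guard against log(0) when theta happens to be 0
        log_s += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        path.append(log_s)
    return path


if __name__ == "__main__":
    # Hypothetical data stream: the mean shifts from 0 to 2 halfway through.
    data_rng = random.Random(0)
    stream = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
    stream += [data_rng.gauss(2.0, 1.0) for _ in range(200)]

    p_vals = conformal_p_values(stream, random.Random(1))
    log_path = power_martingale(p_vals, epsilon=0.5)

    # An alarm rule of the form "stop when S_n >= c" has false-alarm
    # probability at most 1/c under exchangeability (Ville's inequality).
    c = 100.0
    alarm = max(log_path) >= math.log(c)
    print("largest log S_n:", max(log_path), "alarm raised:", alarm)
```

A plain martingale of this kind can lose most of its capital before a change occurs, which is one motivation for the CUSUM-type and Shiryaev–Roberts-type procedures mentioned above.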
