Exchangeability and regression models

Sir David Cox’s statistical career and his lifelong interest in the theory and application of stochastic processes began with problems in the wool industry. The problem of drafting a strand of wool yarn to near-uniform width is not an auspicious starting point, but an impressive array of temporal and spectral methods from stationary time series was brought to bear on the problem in Cox (1949). His ability to extract the fundamental from the mundane became evident in his discovery or construction of the eponymous Cox process in the counting of neps in a sample of wool yarn (Cox 1955). Subsequent applications included hydrology and long-range dependence (Davison and Cox 1989; Cox 1991), models for rainfall (Cox and Isham 1988; Rodriguez-Iturbe, Cox and Isham 1987, 1988), and models for the spread of infectious diseases (Anderson, Cox and Hillier 1989).

At some point in the late 1950s, the emphasis shifted to statistical models for dependence, that is, the way in which a response variable depends on known explanatory variables or factors (Cox 1958a). His contributions in both areas have been extraordinarily insightful, Cox processes being a fundamental class of point processes, and the Cox model playing a similar role in survival analysis. In addition to these, we have the Box-Cox transformation, binary regression models (Cox 1958b) and models relevant to agricultural field trials. This brief summary is a gross simplification of Sir David’s work, but it suits my purpose by way of introduction because the chief goal of this chapter is to explore the relation between exchangeability, a concept from stochastic processes, and regression models in which the observed process is modulated by a covariate.

It is usual to introduce the notion of a stochastic process as a collection of random variables, $Y_1, Y_2, \ldots$, usually an infinite set though not necessarily an ordered sequence. What this means is that $U$ is an index set of statistical units, and for each finite subset $S = \{u_1, \ldots, u_n\}$ of elements in $U$, the value $Y(S) = (Y(u_1), \ldots, Y(u_n))$ of the process on $S$ has distribution $P_S$ on $\mathbb{R}^S$. This chapter emphasizes probability distributions rather than random variables. A real-valued process is thus a consistent assignment of probability distributions to observation spaces such that the distribution $P_n$ on $\mathbb{R}^n$ is the marginal distribution of $P_{n+1}$ on $\mathbb{R}^{n+1}$ under deletion of the relevant coordinate. A notation such as $\mathbb{R}^n$ that emphasizes the dimension of the observation space is not entirely satisfactory because two samples of equal size need not have the same distribution, so we write $\mathbb{R}^S$ rather than $\mathbb{R}^n$ for the set of real-valued functions on the sampled units.

A process is said to be exchangeable if each finite-dimensional distribution is symmetric, or invariant under coordinate permutation. The definition suggests that exchangeability can have no role in statistical models for dependence, in which the distributions are overtly non-exchangeable on account of differences in covariate values. I argue that
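
For reference, the consistency and exchangeability conditions defined in the text above may be sketched in symbols as follows. This is only an illustrative restatement: the set names $S \subset S'$, the coordinate-deletion map $\delta$, the event $A$, the sets $A_i$ and the permutation $\pi$ are notation assumed here and do not appear in the original.

```latex
% Consistency (assumed notation): for finite S \subset S' \subset U, let
% \delta : \mathbb{R}^{S'} \to \mathbb{R}^{S} delete the coordinates in S' \setminus S.
% Then P_S is the marginal of P_{S'}:
\[
  P_S(A) \;=\; P_{S'}\bigl(\delta^{-1}A\bigr)
  \qquad \text{for every measurable } A \subset \mathbb{R}^{S}.
\]

% Exchangeability: each finite-dimensional distribution is invariant under
% coordinate permutation, i.e. for every n and every permutation \pi of \{1,\ldots,n\},
\[
  P_n\bigl(A_1 \times \cdots \times A_n\bigr)
  \;=\;
  P_n\bigl(A_{\pi(1)} \times \cdots \times A_{\pi(n)}\bigr)
  \qquad \text{for all measurable } A_1,\ldots,A_n \subset \mathbb{R}.
\]
```

A familiar instance satisfying both conditions is a sequence of independent and identically distributed variables, for which $P_n$ is an $n$-fold product of a single distribution on $\mathbb{R}$.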

[1] Glenn Shafer, et al. Comments on "Causal Inference without Counterfactuals" by A.P. Dawid, 1999.

[2] J. Besag. What is a statistical model? Discussion, 2002.

[3] David R. Cox, et al. Regression models and life tables (with discussion), 1972.

[4] John A. Nelder, et al. Invariance and factorial models. Discussion, 2000.

[5] Valerie Isham, et al. Some models for rainfall based on stochastic point processes, 1987, Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences.

[6] David R. Cox, et al. A simple spatial-temporal model of rainfall, 1988, Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences.

[7] J. Nelder. A Reformulation of Linear Models, 1977.

[8] B. de Finetti, et al. Theory of Probability, 1981.

[9] Clive W. J. Granger, et al. Some recent developments in a concept of causality, 2001.

[10] Y. Vardi. Empirical Distributions in Selection Bias Models, 1985.

[11] R. Wolpert, et al. Likelihood Principle, 2022, The SAGE Encyclopedia of Research Design.

[12] D. Lindley. Seeing and Doing: the Concept of Causation, 2002.

[13] V. Isham, et al. A point process model for rainfall: further developments, 1988, Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences.

[14] J. Nelder, et al. Double hierarchical generalized linear models (with discussion), 2006.

[15] J. Pearl, et al. Confounding and Collapsibility in Causal Inference, 1999.

[16] D. Cox. Causality: some statistical aspects, 1992.

[17] G. Wahba. Spline models for observational data, 1990.

[18] J. Pearl. Comments on Seeing and Doing, 2002.

[19] M. Kendall. Theoretical Statistics, 1956, Nature.

[20] B. Schmitz, et al. How snapping shrimp snap: through cavitating bubbles, 2000, Science.

[21] D. Rubin. Comment on "Causal inference without counterfactuals," by Dawid AP, 2000.

[22] R. Anderson, et al. Epidemiological and statistical aspects of the AIDS epidemic: introduction, 1989, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences.

[23] D. Aldous. Representations for partially exchangeable arrays of random variables, 1981.

[24] D. Cox. Some Statistical Methods Connected with Series of Events, 1955.

[25] D. A. Bell, et al. Applied Statistics, 1953, Nature.

[26] A. Dawid. Influence Diagrams for Causal Modelling and Inference, 2002.

[27] A. C. Davison, et al. Some simple properties of sums of random variables having long-range dependence, 1989, Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences.

[28] David R. Cox, et al. On sampling and the estimation of rare errors, 1979.

[29] I. Nicoletti, et al. The Planning of Experiments, 1936, Rivista di clinica pediatrica.

[30] A. P. Dawid, et al. Causal inference without counterfactuals (with Discussion), 2000.

[31] J. Nelder, et al. Hierarchical Generalized Linear Models, 1996.

[32] David Roxbee Cox, et al. Theory of drafting of wool slivers. I, 1949, Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences.

[33] B. Silverman, et al. Nonparametric regression and generalized linear models, 1994.

[34] D. Cox. Long-range dependence, non-linearity and time irreversibility, 1991.

[35] G. Wahba. A Comparison of GCV and GML for Choosing the Smoothing Parameter in the Generalized Spline Smoothing Problem, 1985.

[36] David R. Cox, et al. Prediction and asymptotics, 1996.

[37] G. Box. An analysis of transformations (with discussion), 1964.

[38] J. Kingman. The Representation of Partition Structures, 1978.

[39] A. Agresti. Categorical data analysis, 1993.

[40] G. Robinson. That BLUP is a Good Thing: The Estimation of Random Effects, 1991.

[41] A. B. Hill, et al. Principles of Medical Statistics, 1950, The Indian Medical Gazette.

[42] P. McCullagh, et al. A theory of statistical models for Monte Carlo integration, 2003.

[43] B. Efron. Selection Criteria For Scatterplot Smoothers, 1999.