THE COMPARISON OF PROPORTIONS: A REVIEW OF SIGNIFICANCE TESTS, CONFIDENCE INTERVALS AND ADJUSTMENTS FOR STRATIFICATION

Probably the most common quantitative problem in biological research involves the question of the presence or absence of a particular attribute. In its simplest form it reduces to a 2 x 2 table analysis or a test for the difference between two proportions; in a more complex form it may involve comparison of several proportions adjusted for an auxiliary variable. This paper reviews the theory and application of statistical analyses appropriate for such data. The theory is set forth in the general framework of logistic models as formulated by Cox [16, 21] (see also Rasch [60]). This analysis uses the notion of conditioning the distribution of the test statistic on the ancillary statistics, which implies, in most cases, the so-called "fixed marginals" analysis of contingency tables. Other theoretical formulations are also considered briefly. The paper is divided into three parts: the first deals with 2 x 2 tables, the second with the combination of 2 x 2 tables, and the last with the combination of 2 x t tables. The latter two subjects are closely related to the analyses of higher order interaction in contingency tables and the fitting of multivariate contingency table data. These subjects are treated only peripherally here: the former subject has been thoroughly reviewed recently by Plackett [58] and the latter subject has been treated by Bishop [7] whose paper contains an extensive bibliography. The excellent papers by Birch [5, 6] deal with many of the questions we discuss here. For additional references, see Cox [21].
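As a rough illustration of the "fixed marginals" analysis of a single 2 x 2 table, the sketch below conditions on the row and column totals, so that under the hypothesis of no association the upper-left cell follows a hypergeometric distribution. The function names and the example table are hypothetical and are not taken from the paper; this shows only the one-sided exact test and the sample cross-product (odds) ratio, not the full logistic formulation.

```python
from math import comb


def hypergeom_pmf(k, r1, r2, c1):
    """P(upper-left cell = k) given row totals r1, r2 and first-column
    total c1, assuming no association (odds ratio equal to 1)."""
    return comb(r1, k) * comb(r2, c1 - k) / comb(r1 + r2, c1)


def exact_upper_tail(a, b, c, d):
    """One-sided conditional p-value for the table [[a, b], [c, d]]:
    the probability of an upper-left cell at least as large as a,
    with all marginal totals held fixed."""
    r1, r2, c1 = a + b, c + d, a + c
    k_max = min(r1, c1)  # largest value the upper-left cell can take
    return sum(hypergeom_pmf(k, r1, r2, c1) for k in range(a, k_max + 1))


def sample_odds_ratio(a, b, c, d):
    """Sample cross-product ratio ad/bc; undefined when b*c == 0."""
    return (a * d) / (b * c)


# Hypothetical table: 3 of 4 in group 1 and 1 of 4 in group 2 have the attribute.
p_value = exact_upper_tail(3, 1, 1, 3)
odds_ratio = sample_odds_ratio(3, 1, 1, 3)
```

For this hypothetical table the conditional p-value works out to 17/70, because only the outcomes with 3 or 4 in the upper-left cell are at least as extreme as the one observed.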

[1] M. Bartlett. Contingency Table Interactions, 1935.

[2] Marvin A. Kastenbaum, et al. On the Hypothesis of No "Interaction" in a Multi-way Contingency Table, 1956.

[3] W. Harkness. Properties of the extended hypergeometric distribution, 1965.

[4] S. Radhakrishna, et al. Combination of results from several 2 × 2 contingency tables, 1965.

[5] M. W. Birch. The Detection of Partial Association, II: The General Case, 1965.

[6] D. Fraser. The Structure of Inference, 1969.

[7] G. W. Snedecor. Statistical Methods, 1964.

[8] J. Gart. Approximate Confidence Limits for the Relative Risk, 1962.

[9] Nathan Mantel, et al. Chi-square tests with one degree of freedom, 1963.

[10] B. Woolf. On Estimating the Relation between Blood Group and Disease, 1955, Annals of Human Genetics.

[11] D. Cox. Note on Grouping, 1957.

[12] E. Lehmann. Testing Statistical Hypotheses, 1960.

[13] R. A. Fisher, et al. Statistical methods and scientific inference, 1957.

[14] Donald G. Thomas. Algorithm AS 36: Exact Confidence Limits for the Odds Ratio in a 2 × 2 Table, 1971.

[15] W. G. Cochran. The effectiveness of adjustment by subclassification in removing bias in observational studies, 1968, Biometrics.

[16] H. W. Norton. Calculation of Chi-Square for Complex Contingency Tables, 1945.

[17] Ronald A. Fisher, et al. Statistical methods and scientific inference, 1957.

[18] G. H. Jowett. The Harmonic Standardisation of Comparisons between Success Rates in Two Heterogeneous Groups of Patients, 1964.

[19] J. Gart, et al. Numerical Results on Approximate Confidence Limits for the Odds Ratio, 1972.

[20] O. Miettinen. Estimation of relative risk from individually matched series, 1970, Biometrics.

[21] R. Fisher, et al. The Logic of Inductive Inference, 1935.

[22] Leo A. Goodman, et al. Simultaneous Confidence Limits for Cross-Product Ratios in Contingency Tables, 1964.

[23] D. Cox. The Regression Analysis of Binary Sequences, 1958.

[24] W. G. Cochran. The comparison of percentages in matched samples, 1950, Biometrika.

[25] B. M. Bennett. Note on χ² Tests for Matched Samples, 1968.

[26] P. Armitage. The Chi-Square Test for Heterogeneity of Proportions, after Adjustment for Stratification, 1966.

[27] Y. Bishop, et al. Full Contingency Tables, Logits, and Split Contingency Tables, 1969.

[28] W. Stevens. Mean and variance of an entry in a contingency table, 1951.

[29] O. S. Miettinen, et al. The matched pairs design in the case of all-or-none responses, 1968, Biometrics.

[30] John J. Gart, et al. Point and interval estimation of the common odds ratio in the combination of 2 × 2 tables with fixed marginals, 1970.

[31] Jerome Cornfield, et al. A Statistical Problem Arising from Retrospective Studies, 1956.

[32] O. S. Miettinen, et al. Individual matching with multiple controls in the case of all-or-none responses, 1969, Biometrics.

[33] B. M. Bennett. Tests of Hypotheses Concerning Matched Samples, 1967.

[34] J. Haldane. The Exact Value of the Moments of the Distribution of χ² Used as a Test of Goodness of Fit, When Expectations Are Small, 1937.

[35] D. Cox. Two further applications of a model for binary regression, 1958.

[36] Robert J. Buehler, et al. Some Validity Criteria for Statistical Inferences, 1959.

[37] M. W. Birch. The Detection of Partial Association, I: The 2 × 2 Case, 1964.

[38] D. Cox. Some problems connected with statistical inference, 1958.

[39] J. Gart, et al. On the bias of various estimators of the logit and its variance with application to quantal bioassay, 1967, Biometrika.

[40] J. Gart. An exact test for comparing matched proportions in crossover designs, 1969.

[41] G. Chase, et al. On the efficiency of matched pairs in Bernoulli trials, 1968.

[42] David R. Cox. The analysis of binary data, 1970.

[43] C. Odoroff. A Comparison of Minimum Logit Chi-Square Estimation and Maximum Likelihood Estimation in 2 × 2 × 2 and 3 × 2 × 2 Contingency Tables: Tests for Interaction, 1970.

[44] W. M. Kincaid. The Combination of 2 × m Contingency Tables, 1962.

[45] Leo A. Goodman, et al. Simple Methods for Analyzing Three-Factor Interaction in Contingency Tables, 1964.

[46] N. Mantel, et al. Models for complex contingency tables and polychotomous dosage response curves, 1966, Biometrics.

[47] E. Sverdrup. The present state of the decision theory and the Neyman-Pearson theory, 1966.

[48] D. Cox. A simple example of a comparison involving quantal data, 1966, Biometrika.

[49] J. M. Nam, et al. On two tests for comparing matched proportions, 1971, Biometrics.

[50] J. J. Gart, et al. Bioassay of pesticides and industrial chemicals for tumorigenicity in mice: a preliminary note, 1969, Journal of the National Cancer Institute.

[51] F. Mosteller. Some Statistical Problems in Measuring the Subjective Response to Drugs, 1952.

[52] L. A. Goodman. The Multivariate Analysis of Qualitative Data: Interactions among Multiple Classifications, 1970.

[53] J. Hannan, et al. Normal Approximation to the Distribution of Two Independent Binomials, Conditional on Fixed Sum, 1963.

[54] D. Cox. Fieller's theorem and a generalization, 1967, Biometrika.

[55] G. Koch, et al. Analysis of categorical data by linear models, 1969, Biometrics.

[56] W. Haenszel, et al. Statistical aspects of the analysis of data from retrospective studies of disease, 1959, Journal of the National Cancer Institute.

[57] Sir Ronald Fisher, et al. Confidence Limits for a Cross-Product Ratio, 1962.

[58] John J. Gart, et al. On the Combination of Relative Risks, 1962.