On issues concerning the assessment of information contained in aggregate data using the F-statistic

In this paper we investigate how informative the aggregate data of a 2x2 table are for inferring whether an association exists between its variables. We develop an F-test for the statistical significance of the information contained in the aggregate data for inferring a statistically significant association between the variables. Unlike Pearson's chi-squared statistic, the F-statistic is unaffected by changes in the sample size and depends only on the aggregate information in the contingency table. The statistic therefore makes it possible to study the structure of a 2x2 table without the influence of sample size. The applicability of the test is demonstrated using Selikoff's (1981) asbestosis data, which were collected in 1963 from 1117 insulation workers in New York City to explore the link between asbestosis and occupational exposure to asbestos fibres. Such work was key to establishing the link between asbestosis and mesothelioma. As a result of findings of this nature, many international government organisations have now banned the production and importation of goods that contain asbestos fibres.
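The sample-size dependence of Pearson's chi-squared statistic, which motivates the F-test, can be illustrated with a small sketch. The counts below are hypothetical and are not the Selikoff data; the function name `pearson_chi2` is likewise an assumption made for illustration. The F-statistic itself is not reproduced here, since its formula is not stated in this abstract.

```python
def pearson_chi2(table):
    """Pearson chi-squared statistic for a 2x2 table of counts
    (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, c_tot in enumerate(col_totals):
            expected = r * c_tot / n  # expected count under independence
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical 2x2 table (illustrative counts only).
base = [[30, 20], [10, 40]]
# Same cell proportions, ten times the sample size.
scaled = [[10 * x for x in row] for row in base]

chi2_base = pearson_chi2(base)
chi2_scaled = pearson_chi2(scaled)
print(chi2_base, chi2_scaled)
```

With the cell proportions held fixed, the chi-squared statistic grows in direct proportion to the sample size, so a large enough sample makes almost any association appear significant. This is the behaviour the abstract contrasts with the F-statistic.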
