Bayesian Factor Analysis for Spatially Correlated Data, With Application to Summarizing Area-Level Material Deprivation From Census Data

This article describes a Bayesian hierarchical model for factor analysis of spatially correlated multivariate data. The first level specifies, for each area on a map, the distribution of a vector of manifest variables conditional on an underlying latent factor; at the second level, the area-specific latent factors have a joint distribution that incorporates spatial correlation. The framework allows for both marginal and conditional (e.g., conditional autoregressive) specifications of spatial correlation. The model is used to quantify material deprivation at the census tract level using data from the 1990 U.S. Census in Rhode Island. An existing and widely used measure of material deprivation is the Townsend index, an unweighted sum of four standardized census variables (i.e., Z scores) corresponding to area-level proportions of unemployment, car ownership, crowding, and home ownership. The Townsend index and many related indices are computed as linear combinations of measured census variables, which motivates the factor-analytic structure adopted here. The model-based index is the posterior expectation of the latent factor, given the census variables and model parameters. Constructing the index from a model allows several improvements over Townsend's and similarly constructed indices: (1) the index can be represented as a weighted sum of (standardized) census variables, with data-driven weights; (2) by using posterior summaries, the indices can be reported with corresponding measures of uncertainty; and (3) incorporating information from neighboring areas improves the precision of the posterior parameter distributions. We apply the model to Rhode Island census tracts to summarize variation in material deprivation across the state, and our analysis entertains various spatial covariance structures. We summarize the relative contribution of each census variable to the latent index, suggest ways to report material deprivation at the area level, and compare our model-based summaries with those found by applying the standard Townsend index.
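
To make the contrast between an unweighted index and a weighted one concrete, the following is a minimal sketch and not the article's implementation. It computes a Townsend-style index as an unweighted sum of Z scores, and then the weights implied by the posterior mean of the latent factor in a simplified one-factor model. The simplifications are assumptions made here for illustration: the model is non-spatial (areas are treated as independent), the loadings and residual variances are treated as known, and the input data, function names, and parameter values are all hypothetical.

```python
import numpy as np

def z_scores(X):
    """Standardize each column (census variable) to mean 0 and SD 1 across areas."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def townsend_style_index(X):
    """Unweighted sum of the standardized variables, one value per area."""
    return z_scores(X).sum(axis=1)

def factor_score_weights(loadings, uniquenesses):
    """Weights w with E[factor | z] = w @ z in a non-spatial one-factor model.

    Assumed model (a simplification, not the spatial model of the article):
        z_j = loadings_j * factor + e_j,  factor ~ N(0, 1),
        e_j ~ N(0, uniquenesses_j), independent across variables.
    """
    lam = np.asarray(loadings, dtype=float)
    psi = np.asarray(uniquenesses, dtype=float)
    lam_over_psi = lam / psi
    # Posterior precision of the factor is 1 + sum_j lam_j^2 / psi_j.
    return lam_over_psi / (1.0 + lam @ lam_over_psi)

# Hypothetical proportions for 5 areas and 4 census variables (made-up values).
X = np.array([
    [0.05, 0.80, 0.02, 0.70],
    [0.12, 0.55, 0.08, 0.40],
    [0.08, 0.70, 0.04, 0.60],
    [0.20, 0.35, 0.15, 0.25],
    [0.03, 0.90, 0.01, 0.85],
])

unweighted = townsend_style_index(X)

# Hypothetical loadings and residual variances; in a fitted model these
# would come from the posterior rather than being fixed by hand.
w = factor_score_weights([0.9, -0.7, 0.8, -0.6], [0.2, 0.5, 0.3, 0.6])
weighted = z_scores(X) @ w
```

In this simplified setting, variables with larger loadings relative to their residual variance receive more weight, whereas the unweighted index gives every standardized variable the same weight. The spatial model described above additionally borrows information from neighboring areas, which this sketch does not attempt.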
