Dealing with Missing Data: A Comparative Exploration of Approaches Using the Integrated City Sustainability Database

Studies of governments and local organizations using survey data have played a critical role in the development of urban studies and related disciplines. However, missing data pose a daunting challenge for this research. This article seeks to raise awareness about the treatment of missing data in urban studies research by comparing and evaluating three commonly used approaches to handling missing data: listwise deletion, single imputation, and multiple imputation. Comparative analyses illustrate the relative performance of these approaches using the second-generation Integrated City Sustainability Database (ICSD). The results demonstrate the benefit of an approach based on multiple imputation, which uses a theoretically informed and statistically supported set of predictor variables to develop a more complete sample that is free of the issues raised by survey nonresponse. The results also confirm the usefulness of the ICSD for studying environmental, sustainability, and other policies in U.S. cities. We conclude with a discussion of the results and a set of recommendations for urban research scholars.
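To make the comparison concrete, the following is a minimal sketch of the three approaches applied to estimating a single mean. It is not the article's procedure and does not use the ICSD: the data are synthetic, the function names (`listwise_mean`, `mean_imputed`, `multiple_imputation_mean`) are hypothetical, and the multiple-imputation step is a simplified bootstrap-based regression imputation pooled with Rubin's rules rather than a full chained-equations model. Missingness in `y` is assumed to depend only on an observed covariate `x`.

```python
import numpy as np


def listwise_mean(y):
    """Listwise deletion: drop incomplete cases, then estimate."""
    obs = y[~np.isnan(y)]
    return obs.mean(), obs.var(ddof=1) / obs.size


def mean_imputed(y):
    """Single imputation: fill every gap with the observed mean."""
    filled = np.where(np.isnan(y), np.nanmean(y), y)
    # Treating filled values as real data understates the variance.
    return filled.mean(), filled.var(ddof=1) / filled.size


def multiple_imputation_mean(x, y, m=20, seed=0):
    """Simplified multiple imputation of y from x, pooled with Rubin's rules."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(y)
    x_obs, y_obs = x[~miss], y[~miss]
    estimates, variances = [], []
    for _ in range(m):
        # Bootstrap the observed cases to reflect parameter uncertainty.
        idx = rng.integers(0, x_obs.size, x_obs.size)
        slope, intercept = np.polyfit(x_obs[idx], y_obs[idx], 1)
        resid = y_obs[idx] - (intercept + slope * x_obs[idx])
        sigma = resid.std(ddof=2)
        # Impute with the fitted line plus random noise.
        completed = y.copy()
        completed[miss] = (intercept + slope * x[miss]
                           + rng.normal(0.0, sigma, miss.sum()))
        estimates.append(completed.mean())
        variances.append(completed.var(ddof=1) / completed.size)
    pooled = np.mean(estimates)
    within = np.mean(variances)
    between = np.var(estimates, ddof=1)
    total = within + (1 + 1 / m) * between  # Rubin's total variance
    return pooled, total


# Synthetic data: the true mean of y is 1.0.
rng = np.random.default_rng(42)
n = 2000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)
# Cases with larger x are more likely to be missing y.
p_missing = 1.0 / (1.0 + np.exp(-x))
y = np.where(rng.random(n) < p_missing, np.nan, y_full)

lw = listwise_mean(y)
si = mean_imputed(y)
mi = multiple_imputation_mean(x, y)
for name, (est, var) in [("listwise", lw), ("mean imputation", si),
                         ("multiple imputation", mi)]:
    print(f"{name}: estimate={est:.3f}, std. error={np.sqrt(var):.3f}")
```

In this setup, listwise deletion and mean imputation return the same biased point estimate, and mean imputation additionally reports a smaller standard error than the data support. The multiple-imputation estimate draws on `x` to correct the bias and carries the between-imputation variance into its standard error.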
