Improving Data Quality: Actors, Incentives, and Capabilities

This paper examines the construction and use of data sets in political science. We focus on three interrelated questions: How might we assess data quality? What factors shape data quality? How can these factors be addressed to improve data quality? We first outline some problems with the quality of existing data sets, including issues of validity, coverage, and accuracy, and we discuss ways of identifying these problems as well as some of their consequences. The core of the paper addresses the second question by analyzing the incentives and capabilities facing four key actors in a data supply chain: respondents, data collection agencies (including state bureaucracies and private organizations), international organizations, and academic scholars. We conclude with suggestions for improving the construction and use of data sets.

“It is a capital mistake, Watson, to theorise before you have all the evidence. It biases the judgment.” (Sherlock Holmes in “A Study in Scarlet”)

“Statistics make officials, and officials make statistics.” (Chinese proverb)
