Development of a method for ex-post identification of falsifications in survey data

We report results of a research project on the ex-post detection of falsified data in surveys. Based on an analysis of the motivation for falsification, we develop, test and apply multivariate statistical methods that can be used to identify falsifications in survey data. The methods build on specific statistical properties of falsified interviews and on their interdependence. Indicators capturing these properties are calculated from the data collected by each interviewer, and interviewers are classified on the basis of these indicators. In a first explorative phase, we identify further attributes of questionnaires that are useful for detecting interviewers who produce falsified data. These attributes include both specific types of content, e.g. knowledge questions, and response behaviour across several questions, e.g. reliability in multi-item scales. We explore to what extent these additional indicators improve the ability to distinguish between potential falsifiers and regular interviewers. The sensitivity of the results to the number of interviews available is analyzed by means of a bootstrap analysis. The results are discussed with regard to methodological issues in the development of our data-driven approach to identifying falsified interview data, as well as its potential application during the field phase of real surveys.
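The general workflow described above (indicators per interviewer, a multivariate grouping of interviewers, and a bootstrap check of sensitivity to the number of interviews) can be illustrated with a minimal sketch. It is not the project's actual implementation: the two indicators (item nonresponse rate and share of extreme answers), the plain 2-means clustering, the toy data, and all function names are assumptions chosen for illustration only.

```python
import random
import statistics

def indicators(interviews):
    """Hypothetical indicators for one interviewer: (item nonresponse rate,
    share of extreme answers). Answers are on a 1-5 scale; None = no answer."""
    answers = [a for iv in interviews for a in iv]
    given = [a for a in answers if a is not None]
    nonresponse = 1 - len(given) / len(answers)
    extreme = sum(a in (1, 5) for a in given) / len(given) if given else 0.0
    return (nonresponse, extreme)

def standardize(rows):
    """Z-standardize each indicator column (constant columns are left at 0)."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(r, means, sds)) for r in rows]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def two_means(points, iters=50):
    """Plain 2-means clustering, initialized with the two most distant points."""
    c0, c1 = max(((p, q) for p in points for q in points),
                 key=lambda pq: dist2(*pq))
    labels = []
    for _ in range(iters):
        new = [0 if dist2(p, c0) <= dist2(p, c1) else 1 for p in points]
        if new == labels:
            break
        labels = new
        g0 = [p for p, l in zip(points, labels) if l == 0]
        g1 = [p for p, l in zip(points, labels) if l == 1]
        if g0:
            c0 = tuple(statistics.mean(c) for c in zip(*g0))
        if g1:
            c1 = tuple(statistics.mean(c) for c in zip(*g1))
    return labels

def classify(data):
    """Assign each interviewer to one of two clusters based on the indicators."""
    return two_means(standardize([indicators(ivs) for ivs in data]))

def bootstrap_stability(data, reps=200, rng=None):
    """Share of bootstrap replicates in which each interviewer keeps the
    cluster assigned on the full data (labels aligned up to a swap)."""
    rng = rng or random.Random(0)
    ref = classify(data)
    n = len(data)
    counts = [0] * n
    for _ in range(reps):
        # Resample each interviewer's interviews with replacement.
        resampled = [[rng.choice(ivs) for _ in ivs] for ivs in data]
        labels = classify(resampled)
        agree = sum(a == b for a, b in zip(ref, labels))
        if agree < n - agree:  # cluster numbering is arbitrary: flip if needed
            labels = [1 - l for l in labels]
        for i, (a, b) in enumerate(zip(ref, labels)):
            counts[i] += a == b
    return [c / reps for c in counts]

# Toy data (invented): four interviewers with varied answers and nonresponse,
# two interviewers whose interviews cluster around the scale midpoint.
honest = [[1, 5, None, 3, 2], [5, 1, 4, None, 1],
          [2, 5, 1, None, 5], [1, None, 5, 5, 3]]
faked = [[3, 3, 2, 4, 3], [3, 2, 3, 4, 3], [4, 3, 3, 2, 3], [3, 3, 4, 3, 2]]
data = [honest, honest[1:] + honest[:1], honest[:3], honest[1:],
        faked, faked[::-1]]

labels = classify(data)
stability = bootstrap_stability(data, reps=100, rng=random.Random(0))
```

Here `labels` holds the cluster assignment per interviewer and `stability` the fraction of bootstrap replicates in which that assignment is reproduced. Values well below 1 would suggest that the number of interviews per interviewer is too small for a reliable classification under these assumed indicators.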
