A theoretical framework to improve the quality of manually acquired data

Abstract: We present a framework that helps organisations prevent errors in data entry. The framework posits that data entry errors can be prevented when data producers have a strong intention to enter data correctly and when the task-technology fit is high. Two empirical studies support the framework and show that a high task-technology fit matters more than the data producers' intention. The framework refines the theory of planned behaviour and extends the explanatory domain of the task-technology fit construct. The empirical evidence underlines the importance of task-technology fit, a construct that is often neglected in information systems research.
