A Model of Error Propagation in Satisficing Decisions and its Application to Database Quality Management

This study focuses on the accuracy dimension of information quality and models the relationship between input accuracy and output accuracy in a popular class of applications: dichotomous decisions or judgments implemented through a conjunction of selected criteria. The paper first introduces a model of a single decision rule that uses a single binary conjunction operation. It then extends the model to multiple, related decision rules comprising any number of binary conjunction operations. Finally, the extended model is illustrated with the example of an online hotel reservation database, which shows how the model can be used to rank and quantify the damage inflicted by errors in different database attributes. Numerical estimates from the model can be integrated into cost-benefit analyses that assess alternative data accuracy enhancements, process designs, or system designs.
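
To make the idea of error propagation through a conjunction concrete, the following is a minimal sketch and not the paper's own model. It assumes (these are illustrative assumptions, not taken from the source) that each criterion is a binary value, that criteria are statistically independent, and that each recorded value is flipped independently with a fixed error rate. The function names `conjunction_accuracy` and `rank_by_damage`, and the example probabilities, are hypothetical.

```python
from itertools import product


def conjunction_accuracy(p_true, err):
    """Probability that AND over the recorded criteria equals AND over the true criteria.

    p_true[i]: assumed probability that criterion i is truly satisfied.
    err[i]:    assumed probability that the recorded value of criterion i is flipped.
    Criteria and errors are assumed independent (an illustrative simplification).
    """
    n = len(p_true)
    total = 0.0
    # Enumerate every combination of true values and flip patterns.
    for truth in product([0, 1], repeat=n):
        for flips in product([0, 1], repeat=n):
            prob = 1.0
            for t, f, p, e in zip(truth, flips, p_true, err):
                prob *= (p if t else 1 - p) * (e if f else 1 - e)
            observed = [t ^ f for t, f in zip(truth, flips)]
            # The decision is accurate when both conjunctions agree.
            if all(observed) == all(truth):
                total += prob
    return total


def rank_by_damage(p_true, err):
    """Rank criteria by the output inaccuracy caused when only that criterion has errors."""
    damages = {}
    for i in range(len(p_true)):
        only_i = [e if j == i else 0.0 for j, e in enumerate(err)]
        damages[i] = 1.0 - conjunction_accuracy(p_true, only_i)
    return sorted(damages.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical two-criterion rule, e.g. "price within budget" AND "has parking".
    p_true = [0.3, 0.8]
    err = [0.05, 0.05]
    print(conjunction_accuracy(p_true, err))
    print(rank_by_damage(p_true, err))
```

Under these assumptions, equal error rates in two attributes need not cause equal damage: an error in one criterion changes the decision only when the other criteria are satisfied, so the ranking depends on how often each criterion holds.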
