Ontological Considerations When Modeling Missing Data With Relational Databases

SQL’s use of nulls to indicate missing data in relational databases has been criticized as violating the relational model. In this article, I review this critique and two popular proposals for resolving it. The first is to retire SQL’s multivalued logic and replace it with a conventional binary logic. The second is to revise SQL’s implementation of multivalued logic so that it can recognize tautological propositions. I argue that underlying this debate is an ontological disagreement about how to properly model missing data and, more generally, social reality. I demonstrate that the relational model provides tools for modeling different types of missing data in different ways and, furthermore, offers a foundation for social research that supports both variable-oriented and case-oriented analysis.
