Model-based diagnosis of spreadsheet programs: a constraint-based debugging approach

Spreadsheets are probably the most successful example of end-user software development tools and are used for a wide variety of purposes. Like any type of software, they are prone to errors, in particular because they are usually developed by non-programmers. While various techniques exist to support developers in finding errors in procedural programs, tool support for spreadsheet debugging is still limited. In this paper, we show how techniques from model-based diagnosis can be applied and extended for spreadsheet debugging by translating the relevant parts of a spreadsheet into a constraint satisfaction problem. We additionally propose both problem-specific and generalizable extensions to the classical diagnosis algorithms, which help to detect potential problems in a spreadsheet more efficiently based on user-provided test cases. The proposed techniques were integrated into a modular framework for spreadsheet debugging and evaluated with respect to scalability on a number of real-world and artificially created spreadsheets. In addition, an error detection exercise involving 24 subjects was performed to assess the general applicability of such advanced spreadsheet debugging techniques for end users.
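To make the core idea concrete, the following is a minimal sketch of consistency-based diagnosis on a toy spreadsheet. It is not the algorithm or framework described in the paper: the cell names, formulas, value domain, and the functions `consistent` and `minimal_diagnoses` are all hypothetical, and the search is a plain brute-force enumeration rather than a hitting-set procedure or any of the proposed extensions. Each formula is treated as a constraint; a set of formula cells is a diagnosis if, once those cells are allowed to take any value, the remaining formulas agree with the user-provided test case.

```python
from itertools import combinations, product

# Hypothetical toy spreadsheet: every formula cell is a function of cell values.
FORMULAS = {
    "B1": lambda v: v["A1"] * v["A2"],  # seeded fault: the intended formula is A1 + A2
    "B2": lambda v: v["A2"] + v["A3"],
    "C1": lambda v: v["B1"] + v["B2"],
}
ORDER = ["B1", "B2", "C1"]  # evaluation order that respects cell dependencies
DOMAIN = range(0, 21)       # assumed small finite domain for relaxed cells


def consistent(abnormal, inputs, expected):
    """True if some assignment to the relaxed (abnormal) cells lets all
    remaining formulas reproduce the expected values of the test case."""
    abnormal = list(abnormal)
    for guess in product(DOMAIN, repeat=len(abnormal)):
        free = dict(zip(abnormal, guess))
        values = dict(inputs)
        for cell in ORDER:
            # A relaxed cell takes the guessed value; others follow their formula.
            values[cell] = free[cell] if cell in free else FORMULAS[cell](values)
        if all(values[c] == x for c, x in expected.items()):
            return True
    return False


def minimal_diagnoses(inputs, expected):
    """Enumerate candidates by increasing size and keep only minimal ones."""
    found = []
    for size in range(len(ORDER) + 1):
        for cand in combinations(ORDER, size):
            if any(set(d) <= set(cand) for d in found):
                continue  # a subset already explains the failure
            if consistent(cand, inputs, expected):
                found.append(cand)
    return found


if __name__ == "__main__":
    test_inputs = {"A1": 2, "A2": 3, "A3": 4}
    test_expected = {"B2": 7, "C1": 12}  # values the user states as correct
    print(minimal_diagnoses(test_inputs, test_expected))
```

In this sketch the test case marks `B2` as correct and `C1` as wrong, so `B2` is ruled out and the faulty formula is narrowed down to either `B1` or `C1`. Brute-force enumeration grows exponentially with the number of formula cells, which is why efficiency matters for spreadsheets of realistic size.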
