Automatic Verification of Database-Centric Systems

Software systems centered around a database are pervasive. They arise in areas as diverse as electronic commerce, e-government, scientific applications, enterprise information systems, and business process management. Such systems are often very complex and prone to costly bugs, hence the need to verify their critical properties. Classical software verification techniques that can be applied to such systems include model checking and theorem proving, but both have serious limitations. Model checking usually requires a finite-state abstraction of the data, which loses semantics for both the system and the properties being verified. Theorem proving is incomplete and requires expert user feedback.
Recently, an alternative approach to the verification of database-centric systems has taken shape at the confluence of the database and computer-aided verification areas. It aims to identify restricted but sufficiently expressive classes of database-driven applications and properties for which sound and complete verification can be performed fully automatically. This approach leverages another trend in database-driven applications: the emergence of high-level specification tools for database-centered systems, such as interactive web applications and data-driven business processes. We next review a few representative examples.
A commercially successful high-level specification tool for Web applications is Web Ratio [1], an outgrowth of the earlier academic prototype WebML [20, 17]. Web Ratio lets designers specify a Web application using an interactive variant of the E-R model augmented with a workflow formalism. Non-interactive variants of Web page specifications had already been proposed in Strudel [39], Araneus [58] and Weave [40], targeting the automatic generation of Web sites from an underlying database.
High-level specification tools have also emerged in the area of business process management, concomitantly with an evolution from the traditional process-centric approach towards data awareness. A notable representative of this class is the business artifact model pioneered in [63, 51] and deployed by IBM in professional services. Business artifacts (or simply “artifacts”) model key business-relevant entities, which are updated by a set of services that implement business process tasks. A collection of artifacts and services is called an artifact system. This modeling approach has been successfully deployed in practice [7, 6, 21, 27, 69] and has been adopted in the OMG standard for Case Management [9].
Tools such as the above automatically generate the database-centric application code from the high-level specification. This not only enables fast prototyping and improves programmer productivity but, as a side effect, provides new opportunities for automatic verification. Indeed, the high-level specification is a natural target for verification, as it addresses the most likely source of errors (the application’s specification, as opposed to the less likely errors in the automatic generator’s implementation). The theoretical and practical results obtained so far on the verification of such systems are quite encouraging. They suggest that, unlike arbitrary software systems, significant classes of data-driven systems may be amenable to automatic verification. This relies on a novel marriage of database and model checking techniques, and is relevant to both the database and the computer-aided verification communities.
In this article, we describe several models and results on automatic verification of database-driven systems, focusing on temporal properties of their underlying workflows. To streamline the presentation, we focus on verification of business artifacts and use it as a vehicle to introduce the main concepts and results. We then summarize some of the work pertaining to other applications, such as data-driven web services.
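
To make the notion of an artifact system concrete, the following is a minimal sketch in Python, not a model taken from the article. It represents one artifact as a record, services as precondition/update pairs, and checks a simple safety property by exhaustively exploring reachable states. The names `Order`, `Service`, `SERVICES`, `find_violation`, and `shipped_implies_paid` are all hypothetical, introduced only for illustration. Exhaustive search works here only because the toy state space is finite; the verification results discussed in this article concern systems over unbounded data, where such naive enumeration does not apply.

```python
from collections import deque
from typing import Callable, List, NamedTuple, Optional


class Order(NamedTuple):
    """A toy artifact: one business entity with two attributes (hypothetical)."""
    status: str  # "created", "paid", "shipped", or "cancelled"
    paid: bool


class Service(NamedTuple):
    """A business task that may update the artifact when its precondition holds."""
    name: str
    pre: Callable[[Order], bool]
    update: Callable[[Order], Order]


# The artifact system: a collection of services acting on the artifact.
SERVICES = [
    Service("pay", lambda o: o.status == "created",
            lambda o: o._replace(status="paid", paid=True)),
    Service("ship", lambda o: o.status == "paid",
            lambda o: o._replace(status="shipped")),
    Service("cancel", lambda o: o.status in ("created", "paid"),
            lambda o: o._replace(status="cancelled")),
]


def find_violation(initial: Order,
                   services: List[Service],
                   invariant: Callable[[Order], bool]) -> Optional[List[str]]:
    """Breadth-first search over reachable artifact states.

    Returns the sequence of service names leading to a state that violates
    the invariant, or None if every reachable state satisfies it.
    """
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, trace = queue.popleft()
        if not invariant(state):
            return trace
        for service in services:
            if service.pre(state):
                successor = service.update(state)
                if successor not in seen:
                    seen.add(successor)
                    queue.append((successor, trace + [service.name]))
    return None


def shipped_implies_paid(order: Order) -> bool:
    """Safety property: an order is never shipped without having been paid."""
    return order.status != "shipped" or order.paid


START = Order(status="created", paid=False)

if __name__ == "__main__":
    print(find_violation(START, SERVICES, shipped_implies_paid))
```

Adding a faulty service whose precondition allows shipping from the `"created"` state makes `find_violation` return a counterexample trace, which is the kind of feedback a verifier for a high-level specification would be expected to give.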

[1]  Parosh Aziz Abdulla,et al.  Recency-Bounded Verification of Dynamic Database-Driven Systems , 2016, PODS.

[2]  Alin Deutsch,et al.  Verification of Hierarchical Artifact Systems , 2016, PODS.

[3]  Diego Calvanese,et al.  Verification of Relational Multiagent Systems with Data Types , 2014, AAAI.

[4]  Diego Calvanese,et al.  Foundations of data-aware process analysis: a database theory perspective , 2013, PODS.

[5]  Jianwen Su,et al.  Data management perspectives on business process management: tutorial overview , 2013, SIGMOD '13.

[6]  Serge Abiteboul,et al.  Collaborative data-driven workflows: think global, act local , 2013, PODS '13.

[7]  Marco Montali,et al.  Verification of Artifact-Centric Systems: Decidability and Modeling Issues , 2013, ICSOC.

[8]  Alessio Lomuscio,et al.  Verification of Agent-Based Artifact Systems , 2013, J. Artif. Intell. Res..

[9]  Alessio Lomuscio,et al.  Verification of GSM-Based Artifact-Centric Systems through Finite Abstraction , 2012, ICSOC.

[10]  Richard Hull,et al.  Data Centric BPM and the Emerging Case Management Standard: A Short Survey , 2012, Business Process Management Workshops.

[11]  Alin Deutsch,et al.  Artifact systems with data dependencies and arithmetic , 2012, TODS.

[12]  Wil M. P. van der Aalst,et al.  Process Mining , 2012, CACM.

[13]  Giuseppe De Giacomo,et al.  Verification of Conjunctive Artifact-Centric Services , 2012, Int. J. Cooperative Inf. Syst..

[14]  Alessio Lomuscio,et al.  An Abstraction Technique for the Verification of Artifact-Centric Systems , 2012, KR.

[15]  Diego Calvanese,et al.  Verification of relational data-centric dynamic systems with external services , 2012, PODS.

[16]  Diego Calvanese,et al.  Foundations of Relational Artifacts Verification , 2011, BPM.

[17]  Richard Hull,et al.  On the equivalence of incremental and fixpoint semantics for business artifacts with Guard-Stage-Milestone lifecycles , 2011, Inf. Syst..

[18]  Richard Hull,et al.  Business artifacts with guard-stage-milestone lifecycles: managing artifact interactions with conditions and events , 2011, DEBS '11.

[19]  IBM Redbooks,et al.  Advanced Case Management With IBM Case Manager , 2011 .

[20]  Diego Calvanese,et al.  Artifact-Centric Workflow Dominance , 2009, ICSOC/ServiceWave.

[21]  John Vergo,et al.  Artifact-Based Transformation of IBM Global Financing , 2009, BPM.

[22]  Jianwen Su,et al.  Enforcing Constraints on Life Cycles of Business Artifacts , 2009, 2009 Third IEEE International Symposium on Theoretical Aspects of Software Engineering.

[23]  Alin Deutsch,et al.  Automatic verification of data-centric business processes , 2009, ICDT '09.

[24]  Joël Ouaknine,et al.  Nets with Tokens which Carry Data , 2008, Fundam. Informaticae.

[25]  John Vergo,et al.  Siena: From PowerPoint to Web App in 5 Minutes , 2008, ICSOC.

[26]  Stéphane Demri,et al.  Model Checking Freeze LTL over One-Counter Automata , 2008, FoSSaCS.

[27]  Santhosh Kumaran,et al.  Artifact-centered operational modeling: Lessons from customer engagements , 2007, IBM Syst. J..

[28]  Jianwen Su,et al.  Towards Formal Analysis of Artifact-Centric Business Process Models , 2007, BPM.

[29]  Jianwen Su,et al.  Specification and Verification of Artifact Behaviors in Business Process Models , 2007, ICSOC.

[30]  Ahmed Bouajjani,et al.  Rewriting Systems with Data , 2007, FCT.

[31]  Marcin Jurdzinski,et al.  Alternation-free modal mu-calculus for data trees , 2007, 22nd Annual IEEE Symposium on Logic in Computer Science (LICS 2007).

[32]  Jianwen Su,et al.  Static Analysis of Business Artifact-centric Operational Models , 2007, IEEE International Conference on Service-Oriented Computing and Applications (SOCA '07).

[33]  Kamal Bhattacharya,et al.  Modeling Business Contexture and Behavior Using Business Artifacts , 2007, CAiSE.

[34]  Alin Deutsch,et al.  Specification and verification of data-driven Web applications , 2007, J. Comput. Syst. Sci..

[35]  Constantin Enea,et al.  A Generic Framework for Reasoning about Dynamic Networks of Infinite-State Processes , 2007, Log. Methods Comput. Sci..

[36]  Thomas Schwentick,et al.  Two-Variable Logic on Words with Data , 2006, 21st Annual IEEE Symposium on Logic in Computer Science (LICS'06).

[37]  Stéphane Demri,et al.  LTL with the Freeze Quantifier and Register Automata , 2006, 21st Annual IEEE Symposium on Logic in Computer Science (LICS'06).

[38]  Liying Sui,et al.  A system for specification and verification of interactive, data-driven web applications , 2006, SIGMOD Conference.

[39]  Alin Deutsch,et al.  Verification of communicating data-driven web services , 2006, PODS '06.

[40]  Thomas Schwentick,et al.  Two-variable logic on data trees and XML reasoning , 2009, JACM.

[41]  Robert J. Glushko,et al.  Document Engineering - Analyzing and Designing Documents for Business Informatics and Web Services , 2005 .

[42]  Akhil Kumar,et al.  A Framework for Document-Driven Workflow Systems , 2005, Business Process Management.

[43]  Alin Deutsch,et al.  A verifier for interactive, data-driven web applications , 2005, SIGMOD '05.

[44]  Thomas Schwentick,et al.  Finite state machines for strings over infinite alphabets , 2004, TOCL.

[45]  Alin Deutsch,et al.  Specification and verification of data-driven web services , 2004, PODS.

[46]  Jianwen Su,et al.  Tools for design of composite Web services , 2004, ACM SIGMOD Conference.

[47]  Paul Gastin,et al.  Pure future local temporal logics are expressively complete for Mazurkiewicz traces , 2004, Inf. Comput..

[48]  Anil Nigam,et al.  Business artifacts: An approach to operational specification , 2003, IBM Syst. J..

[49]  Patricia Bouyer,et al.  An algebraic approach to data languages and timed languages , 2003, Inf. Comput..

[50]  Ahmed Bouajjani,et al.  Automatic verification of recursive procedures with one integer parameter , 2003, Theor. Comput. Sci..

[51]  Marc Spielmann,et al.  Verification of relational transducers for electronic commerce , 2003, J. Comput. Syst. Sci..

[52]  Santhosh Kumaran,et al.  ADoc-oriented programming , 2003, 2003 Symposium on Applications and the Internet, 2003. Proceedings..

[53]  Ronald Fagin,et al.  Data exchange: semantics and query answering , 2003, Theor. Comput. Sci..

[54]  Stefano Ceri,et al.  Designing Data-Intensive Web Applications , 2002 .

[55]  Patricia Bouyer,et al.  A logical characterization of data languages , 2002, Inf. Process. Lett..

[56]  Stefano Ceri,et al.  Conceptual Modeling of Data-Intensive Web Applications , 2002, IEEE Internet Comput..

[57]  Ioana Manolescu,et al.  Specification and Design of Workflow-Driven Hypertexts , 2002, J. Web Eng..

[58]  Stephan Merz,et al.  Model Checking: A Tutorial Overview , 2000, MOVEP.

[59]  Richard Mayr,et al.  Undecidable problems in unreliable computations , 2000, Theor. Comput. Sci..

[60]  Dan Suciu,et al.  Declarative specification of Web sites with Strudel , 2000, The VLDB Journal.

[61]  Jianwen Su,et al.  Optimization techniques for data-intensive decision flows , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[62]  Timos K. Sellis,et al.  A survey of logical models for OLAP databases , 1999, SIGMOD Rec..

[63]  Jianwen Su,et al.  A Framework for Optimizing Distributed Workflow Executions , 1999, DBPL.

[64]  R. Hull,et al.  Declarative workflows that support easy modification and dynamic browsing , 1999, WACC '99.

[65]  Serge Abiteboul,et al.  Relational transducers for electronic commerce , 1998, J. Comput. Syst. Sci..

[66]  Doron A. Peled,et al.  Combining partial order reductions with on-the-fly model-checking , 1994, Formal Methods Syst. Des..

[67]  E. Allen Emerson,et al.  Temporal and Modal Logic , 1991, Handbook of Theoretical Computer Science, Volume B: Formal Models and Semantics.

[68]  A. P. Sistla,et al.  The complexity of propositional linear temporal logics , 1985, JACM.

[69]  Daniel Brand,et al.  On Communicating Finite-State Machines , 1983, JACM.

[70]  A. Prasad Sistla,et al.  The complexity of propositional linear temporal logics , 1982, STOC '82.

[71]  Arnold L. Rosenberg,et al.  Rapid identification of repeated patterns in strings, trees and arrays , 1972, STOC.

[72]  Emil L. Post.  Recursive Unsolvability of a problem of Thue , 1947, Journal of Symbolic Logic.

[73]  Ursula Dresdner,et al.  Computation Finite And Infinite Machines , 2016 .

[74]  Sophia Kluge,et al.  Modeling And Verification Of Parallel Processes , 2016 .

[75]  David L. Martin,et al.  Semantic Web Services , 2012, Springer Berlin Heidelberg.

[76]  Szymon Torunczyk,et al.  Automata based verification over linearly ordered data domains , 2011, STACS.

[77]  Henk de Man,et al.  Case Management: Cordys Approach , 2009 .

[78]  Santhosh Kumaran,et al.  Adaptive Business Objects - A new Component Model for Business Integration , 2005, ICEIS.

[79]  Santhosh Kumaran,et al.  A model-driven approach to industrializing discovery processes in pharmaceutical research , 2005, IBM Syst. J..

[80]  Leonid Libkin,et al.  Elements of Finite Model Theory , 2004, Texts in Theoretical Computer Science.

[81]  Faron Moller,et al.  Verification on Infinite Structures , 2001, Handbook of Process Algebra.

[82]  Stephan Merz,et al.  Model Checking , 2000 .

[83]  Paolo Merialdo,et al.  Araneus in the Era of XML , 1999, IEEE Data Eng. Bull..

[84]  Ralph Kimball,et al.  The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling , 1996 .

[85]  Fred Kröger,et al.  Temporal Logic of Programs , 1987, EATCS Monographs on Theoretical Computer Science.