Deriving a unified fault taxonomy for event-based systems

Dependability and fault tolerance, which are key requirements for business- or safety-critical applications, require explicit knowledge of the faults that may occur within a system. In contrast to other major research directions, the emerging field of distributed event-based systems still lacks a common understanding of faults. In this paper we take a step forward and study the potential origins and effects of faults in such systems. Our work on a unified fault taxonomy follows a rigorous methodology. We first identify five core sub-areas in the broader field of event-based systems and discuss the commonalities and differences among them. We then derive from the existing literature a coherent domain model that captures the specifics of the different areas. The domain model provides a holistic view and covers both structural and procedural aspects of event-based systems. Based on this model, we elaborate a detailed taxonomy of faults, in line with well-established fault dimensions from dependable and secure computing. The fault taxonomy forms the basis for a comprehensive discussion of fault instances across the five sub-areas of event processing.
