Architecting Resilient Computing Systems: a Component-Based Approach. (Conception et implémentation de systèmes résilients par une approche à composants)

Systems inevitably evolve during their operational life, and neither updates nor upgrades should impair their dependability properties. Dependable systems must evolve to accommodate changes such as new threats and undesirable events, application updates, or variations in available resources. A system that remains dependable when facing changes is called resilient. In this paper, we present an approach that uses component-based software engineering technologies to tackle the on-line adaptation of fault tolerance mechanisms. We propose a development process that relies on two key factors: designing fault tolerance mechanisms for adaptation, and leveraging a reflective component-based middleware that enables fine-grained control and modification of the software architecture at runtime. We describe the methodology and the development of adaptive fault tolerance mechanisms, and we evaluate the approach in terms of performance and agility.
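To make the idea of runtime modification more concrete, the following is a minimal sketch, not the middleware used in the paper. It assumes a toy container whose parts can be inspected and rebound while the service keeps running; the names `ReflectiveContainer`, `rebind`, `no_redundancy`, and `time_redundancy` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

Service = Callable[[int], int]
Strategy = Callable[[Service, int], int]


def no_redundancy(service: Service, request: int) -> int:
    """Run the service once, with no fault tolerance."""
    return service(request)


def time_redundancy(service: Service, request: int) -> int:
    """Run the service twice and compare, to detect a transient value fault."""
    first = service(request)
    second = service(request)
    if first != second:
        raise RuntimeError("results disagree: transient fault suspected")
    return first


class ReflectiveContainer:
    """Hypothetical stand-in for a reflective component container."""

    def __init__(self, service: Service, strategy: Strategy) -> None:
        self._slots: Dict[str, Callable] = {
            "service": service,
            "strategy": strategy,
        }

    def introspect(self) -> Dict[str, str]:
        """Report which component currently fills each role."""
        return {role: comp.__name__ for role, comp in self._slots.items()}

    def rebind(self, role: str, component: Callable) -> None:
        """Replace one component at runtime, leaving the others untouched."""
        if role not in self._slots:
            raise KeyError(f"unknown role: {role}")
        self._slots[role] = component

    def handle(self, request: int) -> int:
        """Serve a request through whatever strategy is currently bound."""
        return self._slots["strategy"](self._slots["service"], request)


def square(x: int) -> int:
    return x * x


container = ReflectiveContainer(square, no_redundancy)
before = container.handle(7)

# Adapt on-line: swap only the fault tolerance strategy.
container.rebind("strategy", time_redundancy)
after = container.handle(7)
```

Only the strategy slot changes between the two calls; the service component and the container stay in place, which is the kind of fine-grained, localized change the abstract refers to.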
