Resilient computing: An engineering discipline

The term resilience has been used in many fields, such as child psychology, ecology, and business, with the common meaning of the ability to successfully accommodate unforeseen environmental perturbations or disturbances. The adjective resilient has been in use for decades in the field of dependable computing systems, but essentially as a synonym of fault-tolerant, thus generally ignoring the unexpected aspect of the phenomena the systems may have to face. These phenomena become of primary relevance when moving to the future large, networked, evolving systems that constitute complex information infrastructures, perhaps involving everything from super-computers and huge server “farms” to myriads of small mobile computers and tiny embedded devices, with humans being a central part of the operation of such systems. Such systems are in fact the dawning of the ubiquitous systems that will support Ambient Intelligence. With such ubiquitous systems, what is at stake is maintaining dependability, i.e., the ability to deliver service that can justifiably be trusted, in spite of continuous changes. Therefore the terms resilience and resilient computing can be applied to the design of ubiquitous systems and defined as the search for the following property: the persistence of service delivery that can justifiably be trusted, when facing changes. Changes may differ in nature, in prospect, and in timing. The design of ubiquitous systems therefore requires mastering many, often separate, engineering disciplines, ranging from advanced probability to logic, and from human factors to cryptology, information security, and the management of large projects.