Self-* and Adaptive Mechanisms for Large Scale Distributed Systems

Large-scale distributed computing systems and infrastructures, such as Grids, P2P systems and desktop Grid platforms, are decentralized, pervasive, and composed of a large number of autonomous entities. Their complexity is such that human administration is nearly impossible and centralized or hierarchical control is highly inefficient. These systems must run in highly dynamic environments, where content, network topologies and workloads change continuously. Moreover, they are characterized by the high volatility of their components and by the need to provide efficient service management and to handle large amounts of data efficiently. This paper describes some of the areas in which adaptation emerges as a key feature, namely the management of computational Grids, the self-management of desktop Grid platforms, and the monitoring and healing of complex applications. It also elaborates on the use of bio-inspired algorithms to achieve self-management. Related future trends and challenges are also described.