A probabilistic approach to distributed system management
Large-scale distributed systems are playing an increasing role in computational research, production operations, information processing, and application hosting. Continuous management of such systems is critical to their reliability, availability, and security. As the number of commodity components within these systems continues to grow, it becomes increasingly difficult to track the multitude of parameters required to ensure optimal performance, especially in systems that have been built through incremental expansion rather than as an initial purchase of identical nodes. In this paper, we discuss the use of statistical inference, specifically Markov Logic Networks, in a distributed multi-agent system as an effective means of managing these parameters. We showcase an architecture that provides services to manage a system's configuration throughout its life cycle and that can identify potential misconfigurations and resolve the resulting differences using conflict discovery and resolution modules.
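As a rough illustration of how a Markov Logic Network can flag a likely misconfiguration, the sketch below scores every possible assignment of one boolean setting across three nodes. It is not the architecture described in the paper: the node names, the two formulas, and their weights are all hypothetical, and exact enumeration stands in for the approximate inference a real system would need. The only part taken from the standard Markov Logic Network definition is the scoring rule, in which a world's probability is proportional to the exponential of the weighted count of satisfied formula groundings.

```python
import itertools
import math

# Hypothetical setup: three nodes, each holding one boolean setting.
NODES = ["node_a", "node_b", "node_c"]


def n_agree(world):
    """Count true groundings of 'two nodes share the same setting'."""
    return sum(1 for a, b in itertools.combinations(NODES, 2)
               if world[a] == world[b])


def n_enabled(world):
    """Count true groundings of 'the setting is enabled on a node'."""
    return sum(1 for node in NODES if world[node])


# (weight, counting function) pairs. The weights are illustrative, not learned.
FORMULAS = [(1.5, n_agree), (0.5, n_enabled)]


def score(world):
    """Unnormalized MLN weight: exp(sum_i w_i * n_i(world))."""
    return math.exp(sum(w * n(world) for w, n in FORMULAS))


def all_worlds():
    """Enumerate every assignment of the setting to the nodes."""
    for values in itertools.product([False, True], repeat=len(NODES)):
        yield dict(zip(NODES, values))


def probability(world):
    """Exact probability by brute-force normalization (tiny domains only)."""
    z = sum(score(w) for w in all_worlds())
    return score(world) / z


def most_likely_completion(evidence):
    """Highest-scoring world that agrees with the observed settings."""
    candidates = [w for w in all_worlds()
                  if all(w[k] == v for k, v in evidence.items())]
    return max(candidates, key=score)


def suspected_conflicts(observed):
    """Nodes whose observed value differs from what the other nodes imply."""
    suspects = []
    for node in NODES:
        evidence = {k: v for k, v in observed.items() if k != node}
        if most_likely_completion(evidence)[node] != observed[node]:
            suspects.append(node)
    return suspects
```

With the observation `{"node_a": True, "node_b": True, "node_c": False}`, `suspected_conflicts` singles out `node_c`, because the other two nodes make an enabled setting the more probable value there. Brute-force enumeration grows exponentially with the number of nodes, so this only conveys the idea on a toy domain.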