Reliability Improvement and Models in Autonomic Computing

The rapidly increasing complexity of computing systems is driving a movement towards autonomic systems that can manage themselves without human intervention. Without autonomic technologies, many conventional systems suffer reliability degradation as errors accumulate. Autonomic management techniques break this traditional degradation trend. This paper describes the roles and functions of the various autonomic components and systematically reviews past and current technologies developed to address specific areas of the autonomic computing environment. Identifying these ideas can lead to the design of more advanced autonomic computing technologies that support highly reliable systems, as briefly proposed in the conclusion.
