Configuration discovery and monitoring middleware for enterprise datacenters

Automatic discovery and monitoring of IT resources is a critical part of enterprise systems management. Beyond ascertaining internal device configurations, the discovery process may also need to capture the capabilities, usage, connectivity, availability, and other information about the various IT components. Systems resource management (SRM) tools typically implement discovery using device-specific APIs, custom agents, and/or standards-based solutions such as WBEM and CIM. Discovery actions need to be planned systematically: an inefficiently implemented or scheduled discovery can easily take from a few minutes to several hours to complete in a large heterogeneous enterprise datacenter. This paper discusses the challenges of discovering the configuration of a datacenter environment and presents Magellan, an autonomic configuration monitoring middleware that builds on industry best practices and standards. Magellan reduces overall discovery time by more than 50%, both in our micro-benchmark experiments and in a large datacenter configuration of a major financial organization.
