Galapagos: Automatically Discovering Application-Data Relationships in Networked Systems

In large networked systems, the relationships between applications and the data they use through multiple tiers of middleware are often invisible. Knowing these relationships has clear benefits for systems management, but discovering them is complicated by the widespread adoption of virtualization technologies and by the tendency to treat each middleware tier as an independent "domain". In this paper we present a methodology and a system for automatically discovering end-to-end application-data relationships. The key to the methodology is modeling the data locations from which applications use data and how middleware systems make data available to the software layers above them.
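
The modeling idea in the last sentence can be illustrated with a minimal sketch. It is not the paper's implementation: the `DataLocation` type, the `MAPPINGS` and `APP_USES` tables, and every tier, application, and location name below are hypothetical, chosen only to show how per-tier mappings might be chained into an end-to-end relationship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLocation:
    """A place where data lives, as seen at one tier (hypothetical model)."""
    tier: str  # e.g. "database", "filesystem", "storage"
    name: str  # e.g. a table name, a file path, a volume identifier


# Hypothetical per-tier mappings: each key is a location that a middleware
# tier exposes upward; the value lists the lower-level locations backing it.
MAPPINGS = {
    DataLocation("database", "orders_table"): [
        DataLocation("filesystem", "/db/orders.dat"),
    ],
    DataLocation("filesystem", "/db/orders.dat"): [
        DataLocation("storage", "volume-7"),
    ],
}

# Hypothetical record of the locations each application uses directly.
APP_USES = {
    "billing_app": [DataLocation("database", "orders_table")],
}


def resolve(location, mappings):
    """Return the location and everything beneath it, top tier first."""
    result, seen, stack = [], set(), [location]
    while stack:
        loc = stack.pop()
        if loc in seen:
            continue  # guard against cycles and shared lower-level locations
        seen.add(loc)
        result.append(loc)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(mappings.get(loc, [])))
    return result


def end_to_end(app, app_uses, mappings):
    """Chain the per-tier mappings for every location the application uses."""
    chain = []
    for top in app_uses.get(app, []):
        for loc in resolve(top, mappings):
            if loc not in chain:
                chain.append(loc)
    return chain


if __name__ == "__main__":
    for loc in end_to_end("billing_app", APP_USES, MAPPINGS):
        print(f"{loc.tier}: {loc.name}")
```

In this sketch each tier contributes only its own mapping, so the end-to-end relationship comes from composing the mappings and needs no global view from any single tier.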
