Data monitoring in high-performance clusters for computing applications

The Shared Memory in a LAN-like Environment (SMiLE) project at the Lehrstuhl für Rechnertechnik und Rechnerorganisation, Technical University of Munich (LRR-TUM), investigates high-performance cluster computing using system area networks (SANs). As part of this project, a hardware monitor is being developed to observe SAN traffic. The monitor can therefore deliver detailed information about the run-time communication behavior of applications running on SMiLE clusters. Its central component is a content-addressable counter array that manages a small working set of the most recently referenced memory regions.
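The counter-array idea can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the SMiLE hardware design: the class name `CounterArray`, the parameters `capacity` and `region_size`, the least-recently-referenced eviction policy, and the `flushed` list standing in for whatever storage receives evicted counters are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class CounterArray:
    """Software model of an associative counter array (illustrative only)."""

    def __init__(self, capacity=4, region_size=4096):
        # capacity and region_size are assumed values, not taken from the source.
        self.capacity = capacity
        self.region_size = region_size
        self.counters = OrderedDict()  # region -> count, oldest reference first
        self.flushed = []              # evicted (region, count) pairs

    def record(self, address):
        """Count one observed access to the region containing `address`."""
        region = address // self.region_size
        if region in self.counters:
            # Hit: bump the counter and mark the region as most recently referenced.
            self.counters[region] += 1
            self.counters.move_to_end(region)
            return
        if len(self.counters) >= self.capacity:
            # Miss on a full array: evict the least recently referenced region
            # and keep its count so that no information is lost.
            self.flushed.append(self.counters.popitem(last=False))
        self.counters[region] = 1

    def totals(self):
        """Merge flushed and resident counters into per-region totals."""
        result = {}
        for region, count in self.flushed + list(self.counters.items()):
            result[region] = result.get(region, 0) + count
        return result


monitor = CounterArray(capacity=2, region_size=4096)
for addr in (0, 100, 4096, 8192, 0):
    monitor.record(addr)
per_region = monitor.totals()
```

In this model only `capacity` regions are tracked at once, so the array stays small while the merged totals still cover every region that was referenced.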
