Performance Monitoring and Analysis on Multi-cluster Parallel Jobs
This paper introduces a multi-cluster computing model. Based on an analysis of the features of multi-cluster systems, including their flexible architecture and reconfigurability, it investigates the performance monitoring and analysis methods and techniques applicable to this model. A performance monitoring and analysis tool for parallel jobs is then designed and implemented. The tool uses a dynamic performance analysis method, follows a distributed software design framework, and has a high-cohesion, low-coupling structure. Operational results show that the tool works effectively in the multi-cluster computing model.
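As a rough illustration of what a loosely coupled, run-time (dynamic) monitoring structure might look like, the Python sketch below has node-level agents report samples to a collector that summarizes them per job and per cluster. It is not the tool described in the abstract: the names `Sample`, `Collector`, `report`, and `summarize`, the CPU-utilization metric, and the in-process calls standing in for network messages are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Sample:
    """One run-time measurement from a node-level agent (hypothetical schema)."""
    job_id: str
    cluster: str
    node: str
    cpu_util: float  # fraction of CPU in use, 0.0 to 1.0


class Collector:
    """Gathers samples while jobs run and summarizes them per job.

    Agents only call `report`; they know nothing about storage or
    analysis, which keeps the coupling between components low.
    """

    def __init__(self):
        self._samples = defaultdict(list)  # job_id -> list of Sample

    def report(self, sample):
        # In a real distributed tool this would arrive over the network;
        # here it is a plain method call (an assumption for brevity).
        self._samples[sample.job_id].append(sample)

    def summarize(self, job_id):
        """Return the mean CPU utilization of a job for each cluster it spans."""
        by_cluster = defaultdict(list)
        for s in self._samples.get(job_id, []):
            by_cluster[s.cluster].append(s.cpu_util)
        return {cluster: mean(values) for cluster, values in by_cluster.items()}


# One parallel job whose processes span two clusters.
collector = Collector()
collector.report(Sample("job-1", "cluster-a", "n01", 0.75))
collector.report(Sample("job-1", "cluster-a", "n02", 0.25))
collector.report(Sample("job-1", "cluster-b", "n01", 0.875))
summary = collector.summarize("job-1")
print(summary)
```

Keeping collection (`report`) separate from analysis (`summarize`) is one simple way to let clusters join or leave the system without changing the analysis code, which is the kind of reconfigurability the abstract mentions.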