An Information Monitoring and Job Scheduling System for Multiple Linux PC Clusters

Managing and monitoring a cluster is both tedious and challenging, since each computing node is designed as a stand-alone system rather than as part of a parallel architecture. This paper proposes a software system for the centralized administration of a generic Beowulf cluster. The system also provides Web services and applications for monitoring multiple PC clusters, with support for job submission and scheduling.
