Parallel network RAM: effectively utilizing global cluster memory for large data-intensive parallel programs

Large scientific parallel applications demand large amounts of memory. Current parallel computing platforms schedule jobs without fully knowing their memory requirements, which leads to uneven memory allocation: some nodes are overloaded while memory on others sits idle. Overloaded nodes resort to disk paging, which is extremely expensive in the context of scientific parallel computing. To address this problem, we propose a new peer-to-peer solution called parallel network RAM. This approach avoids the use of disk and makes better use of the RAM available across the cluster. It allows larger problems to be solved while reducing the computational, communication, and synchronization overhead typically incurred by parallel applications.
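The abstract does not detail the paging mechanism, but the central idea, evicting overflow pages to a peer node's idle RAM instead of to local disk, can be sketched. The Python below is a minimal toy illustration under stated assumptions, not the authors' implementation: the class name NetworkRAMPager, the LOCAL_CAPACITY constant, and the dict standing in for a remote peer's memory are all names invented here for clarity; a real system would ship pages over the network and handle peer failures.

```python
from collections import OrderedDict

PAGE_SIZE = 4096          # illustrative page size in bytes (assumption)
LOCAL_CAPACITY = 4        # pages of local RAM in this toy example (assumption)

class NetworkRAMPager:
    """Toy sketch of the network-RAM idea: when local RAM overflows,
    least-recently-used pages are evicted to a peer's spare RAM (modeled
    here by a plain dict) instead of being written to disk."""

    def __init__(self, remote_store):
        self.local = OrderedDict()     # page_id -> bytes, kept in LRU order
        self.remote = remote_store     # stands in for a remote peer's RAM

    def access(self, page_id):
        """Return a page, faulting it in from the peer if necessary."""
        if page_id in self.local:
            self.local.move_to_end(page_id)            # mark most recently used
        else:
            # Page fault: fetch from remote RAM (network latency)
            # rather than from disk (seek latency).
            page = self.remote.pop(page_id, bytes(PAGE_SIZE))
            self._install(page_id, page)
        return self.local[page_id]

    def write(self, page_id, data):
        self._install(page_id, data)

    def _install(self, page_id, data):
        self.local[page_id] = data
        self.local.move_to_end(page_id)
        while len(self.local) > LOCAL_CAPACITY:
            victim, page = self.local.popitem(last=False)  # evict LRU page
            self.remote[victim] = page                     # ship it to peer RAM

# Usage: two nodes would exchange pages over sockets; a dict suffices here.
peer_ram = {}
pager = NetworkRAMPager(peer_ram)
for i in range(8):                       # touch more pages than fit locally
    pager.write(i, f"page-{i}".encode())
assert len(peer_ram) == 4                # overflow went to the peer, not disk
assert pager.access(0) == b"page-0"      # faulted back from remote RAM
```

The point of the sketch is the fault path: a local miss is served from a peer's RAM at network latency rather than from disk, which is where the performance benefit claimed in the abstract would come from.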
