The Utility of Exploiting Idle Memory for Data-Intensive Computations

In this paper, we examine the utility of exploiting idle memory in workstation pools. We attempt to answer the following questions. First, given a workstation pool, what fraction of the memory can be expected to be idle? This provides an estimate of the opportunity for hosting guest data. Second, what fraction of an individual host's memory can be expected to be idle? This helps determine the recruitment policy, that is, the maximum amount of memory that should be recruited on a single host. Third, what is the distribution of memory idle-times? That is, what is the probability that a chunk of memory that is currently idle will remain idle for longer than time t? This information indicates how long guest data can be expected to survive; applications that access their data-sets frequently within the expected life-time of guest data are more likely to benefit from exploiting idle memory. Fourth, how much benefit can a user expect? We use two metrics for the benefit of exploiting idle memory: (1) given a pool with w workstations, how much memory can a user expect to obtain for free by harvesting idle memory; and (2) how much improvement can be achieved in end-to-end execution time? Finally, how long and how frequently might a user have to wait to reclaim her machine if she volunteers it to host guest pages? This helps answer the question of social acceptability. To answer the questions relating to the availability of idle memory, we have analyzed two-week-long traces from five workstation pools with different sizes, locations, and patterns of use. To evaluate the expected benefits and costs, we have simulated three data-intensive applications (0.5GB-5GB) on these workstation pools.
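The third question, about the distribution of idle-times, can be made concrete with an empirical survival estimate. The sketch below is only an illustration under stated assumptions and is not the analysis used in the paper: the function names, the `sample` data, and the choice of minutes as the unit are all hypothetical, and each idle period is assumed to have been observed in full.

```python
from typing import Sequence


def survival_probability(idle_durations: Sequence[float], t: float) -> float:
    """Empirical P(T > t): fraction of observed idle periods longer than t."""
    if not idle_durations:
        raise ValueError("need at least one observed idle period")
    longer = sum(1 for d in idle_durations if d > t)
    return longer / len(idle_durations)


def conditional_survival(idle_durations: Sequence[float], age: float, t: float) -> float:
    """Empirical P(T > age + t | T > age).

    Estimates how likely memory that has already been idle for `age`
    is to stay idle for at least `t` longer.
    """
    survivors = [d for d in idle_durations if d > age]
    if not survivors:
        return 0.0  # no observed period lasted this long; no evidence of survival
    return sum(1 for d in survivors if d > age + t) / len(survivors)


# Hypothetical idle-period lengths in minutes (illustrative, not from the traces).
sample = [5, 12, 30, 45, 90, 240]

p_longer_than_30 = survival_probability(sample, 30)
p_30_more_given_10 = conditional_survival(sample, 10, 30)
```

Under these assumptions, an application that revisits its data-set within time t would favour hosts where the conditional estimate for that t is high.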
