On the Cost-efficiency of Hierarchical Heterogeneous Machines for Compiler- and Hand-Parallelized Applications

His research interests are in the areas of parallel processing, computer architecture, network computing and fault-tolerant computing. He has authored or coauthored over 90 technical papers. He has worked in the areas of compilers, languages and tools for parallel machines; in performance evaluation; and in the design of high performance computer architectures.

(this machine organization is indicated by "L" in Table 3). These two benchmarks have a low CCratio. Furthermore, the two benchmarks that can benefit from an HPAM machine with a high ratio of processor speed between levels are CCt2 and Hairshed. These two benchmarks have a high CCratio. In general, the performance of a given benchmark when executed on a multilevel heterogeneous machine is determined by its parallelism and communication behavior.

Because the fixed-budget comparison had to be strictly observed and network cost was assumed to track processor cost, slow networks were used with slow processors and fast networks with fast processors. Using fast networks throughout the HPAM machine would add to the advantages of HPAM machines over one-level homogeneous machines under the fixed-budget criterion. The CCratio can be improved by (1) using fast networks throughout the entire HPAM machine, (2) providing hardware and software support for fast collective communication across levels and within levels [21], and (3) using computing in memory for the second or third level of an HPAM machine [22].

Finally, as shown by the last two columns of Table 3, hybrid configurations of the two-level and three-level machines were used in 50% and 80% of the cases for the selected benchmarks, respectively. This indicates that hardware and software support for reconfiguration is needed in HPAM. Furthermore, this support becomes more critical as the number of levels increases.

A heterogeneous hierarchical solution to cost-efficient high performance computing.
Hierarchical processors-and-memory architecture for high performance computing.
Towards the design of a heterogeneous hierarchical machine: A simulation approach.

In this simulation-based study each HPAM machine was compared to the optimal one-level homogeneous machine. This optimal machine varies in size and processor speed for a given budget across the benchmarks studied. The results of this study are summarized in Table 3.

                     CCratio when # Processors =    HPAM           Reconfiguration
Benchmark   # DoPs   2          128                 Performance    two-level   three-level
CCt2        2        225.19     196.57              O L H H        yes         yes
HHt2        2        24.97      1.53                O O O L        yes         yes
Cstereo     3        54.78      1.76                L L L L        …
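To make the CCratio columns of Table 3 easier to compare, the following minimal Python sketch computes how strongly each benchmark's CCratio falls between 2 and 128 processors. It uses only the three rows that survive in this excerpt. The "drop factor" (CCratio at 2 processors divided by CCratio at 128 processors) and the names `TABLE3_ROWS`, `ccratio_drop` and `summarize` are assumptions introduced here for illustration; they are not quantities or identifiers defined by the study.

```python
# Rows of Table 3 that survive in this excerpt:
# (benchmark, CCratio at 2 processors, CCratio at 128 processors).
TABLE3_ROWS = [
    ("CCt2", 225.19, 196.57),
    ("HHt2", 24.97, 1.53),
    ("Cstereo", 54.78, 1.76),
]


def ccratio_drop(at_2: float, at_128: float) -> float:
    """Factor by which CCratio shrinks from 2 to 128 processors.

    Illustrative metric only (an assumption of this sketch, not from the study).
    """
    return at_2 / at_128


def summarize(rows):
    """Return (benchmark, drop factor rounded to one decimal) for each row."""
    return [(name, round(ccratio_drop(a, b), 1)) for name, a, b in rows]


if __name__ == "__main__":
    for name, factor in summarize(TABLE3_ROWS):
        print(f"{name}: CCratio falls by a factor of about {factor}")
```

Under this reading, a benchmark whose CCratio stays nearly constant as processors are added (a drop factor close to 1) keeps its high CCratio at scale, while a large drop factor marks a benchmark whose CCratio is low at 128 processors.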