Performance analysis of coarse-grained parallel genetic algorithms on the multi-core Sun UltraSPARC T1

The new generation of shared-memory multi-core processors with multiple parallel execution paths provides a promising hardware platform for applications with a high degree of task-level parallelism (TLP). The Genetic Algorithm (GA), a widely used evolutionary meta-heuristic optimization method, is a unique candidate in this class of applications and exhibits a significant amount of explicit and implicit parallelism. In this paper, we present the performance characteristics of a GA optimizing a placement problem on a Sun UltraSPARC T1 processor. To investigate the behavior of the benchmark, we vary both the algorithm-specific parameters and the size of the target problem. System performance is evaluated by monitoring throughput, cycles per instruction (CPI), and memory access patterns for different core and thread combinations. Our experiments show that, for a constant data size and with all cores active, throughput increases by 84% as the number of threads per core increases from 1 to 4. Similarly, as we increase the number of cores in the system, throughput increases by a factor of 3. The average memory bandwidth scales in proportion to throughput for both core-scaling and thread-scaling. The overall increase in throughput under either core-scaling or thread-scaling, in spite of growing memory bandwidth, shows the ability of the multi-threaded multi-core processor to hide long-latency memory accesses for the targeted benchmark.
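To make the term "coarse-grained parallel GA" concrete, the following is a minimal sketch of an island-model GA, in which several subpopulations evolve independently and periodically exchange their best individuals. It is not the benchmark used in the paper: the objective (`fitness`, a OneMax bit count) is a toy stand-in for the placement problem, and all names and parameters (`island_ga`, `evolve`, `migrate`, `GENOME_LEN`, the ring migration scheme, the population sizes) are assumptions chosen for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

GENOME_LEN = 32  # hypothetical genome size for the toy problem


def fitness(genome):
    # Toy stand-in objective (OneMax): number of 1-bits in the genome.
    return sum(genome)


def evolve(island, rng, generations):
    """Run a few generations of a simple GA on one island, in place."""
    size = len(island)
    for _ in range(generations):
        next_gen = [max(island, key=fitness)[:]]  # elitism: keep the best
        while len(next_gen) < size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(island, 2), key=fitness)
            p2 = max(rng.sample(island, 2), key=fitness)
            cut = rng.randrange(1, GENOME_LEN)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability 1/GENOME_LEN per gene.
            child = [b ^ 1 if rng.random() < 1.0 / GENOME_LEN else b
                     for b in child]
            next_gen.append(child)
        island[:] = next_gen
    return island


def migrate(islands):
    """Ring migration: each island's best replaces the next island's worst."""
    bests = [max(isl, key=fitness)[:] for isl in islands]
    for i, isl in enumerate(islands):
        worst = min(range(len(isl)), key=lambda k: fitness(isl[k]))
        isl[worst] = bests[i - 1]


def island_ga(n_islands=4, pop_size=20, epochs=10, gens_per_epoch=5, seed=0):
    # One private random generator per island keeps runs reproducible.
    rngs = [random.Random(seed + i) for i in range(n_islands)]
    islands = [[[r.randint(0, 1) for _ in range(GENOME_LEN)]
                for _ in range(pop_size)] for r in rngs]
    with ThreadPoolExecutor(max_workers=n_islands) as pool:
        for _ in range(epochs):
            # Islands evolve concurrently; list() waits for all of them.
            list(pool.map(evolve, islands, rngs,
                          [gens_per_epoch] * n_islands))
            migrate(islands)  # synchronization point between epochs
    return max((g for isl in islands for g in isl), key=fitness)


if __name__ == "__main__":
    best = island_ga()
    print(fitness(best), best)
```

The sketch shows structure only. In CPython the global interpreter lock prevents threads from running this pure-Python work on several cores at once, so it should not be used to reproduce the throughput measurements; a process pool or a compiled language would be needed for that.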
