Although the performance of supercomputers on our n-body cosmology code has improved by a factor of nearly 2000 since 1991, the performance per watt has improved only 300-fold and the performance per square foot only 65-fold. Clearly, we are building less and less efficient supercomputers, which has led to the construction of new machine rooms and even entirely new buildings. Furthermore, as these supercomputers continue to follow “Moore’s Law for Power Consumption,” their reliability continues to plummet, as the Arrhenius equation for microelectronics predicts. To address these problems, we built a super-efficient supercomputer dubbed Green Destiny, a 240-processor supercomputer that fits in a telephone booth (i.e., a footprint of five square feet) and sips less than 5.2 kW of power at full load [FWW02, WWF02, Feng03]. This “Supercomputer for the Rest of Us,” a 2003 R&D 100 award-winning machine, provided affordable, general-purpose supercomputing to our application scientists while sitting in a dusty 85-90 °F (29-32 °C) warehouse at 7,400 feet (2,256 meters) above sea level. Furthermore, it delivered reliable computing cycles without any unscheduled downtime and without any special facilities, i.e., no air conditioning, no humidification control, no air filtration, and no ventilation. However, although Green Destiny demonstrated a total price-performance ratio (ToPPeR) that was 50% better than that of a traditional Beowulf cluster or supercomputer, power efficiency (i.e., performance-power ratio) that was up to eight times better, and space efficiency (i.e., performance-space ratio) that was up to thirty times better, both its raw performance and its price/performance lagged those of a traditional Beowulf cluster or supercomputer by a factor of two. Thus, many would argue that Green Destiny sacrificed too much performance in achieving power and space efficiency (and thus, better reliability and total cost of ownership).
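The improvement factors quoted at the start of the abstract imply how much total power draw and floor space have grown, because performance per watt is simply performance divided by power (and likewise for space). The following is a minimal sketch of that arithmetic, assuming all three factors were measured over the same period and on the same machines; the variable names are illustrative and do not come from the paper.

```python
# Improvement factors since 1991, as quoted in the abstract.
perf_gain = 2000.0           # raw performance ("nearly 2000")
perf_per_watt_gain = 300.0   # performance per watt
perf_per_sqft_gain = 65.0    # performance per square foot

# perf/watt = perf / power, so power growth = perf gain / (perf/watt gain).
# The same reasoning applies to floor space.
power_growth = perf_gain / perf_per_watt_gain
space_growth = perf_gain / perf_per_sqft_gain

print(f"implied power growth: {power_growth:.1f}x")
print(f"implied space growth: {space_growth:.1f}x")
```

Under these assumptions, power draw has grown by roughly a factor of 7 and floor space by roughly a factor of 31, which is the sense in which the machines have become less efficient.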
Therefore, we propose to evolve Green Destiny with a hybrid software-hardware solution, one that uses commodity processors from AMD (i.e., Athlon XP-M, Athlon 64, and Opteron) to achieve better performance, coupled with AMD’s “Cool'n'Quiet” technology (formerly PowerNow!) and our novel dynamic voltage scaling (DVS) technique to reduce power consumption by as much as 40% while impacting performance by less than 7%.
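To illustrate why a 40% power reduction at a 7% performance cost is attractive, the sketch below computes the resulting change in performance per watt. It takes the two quoted bounds ("as much as 40%", "less than 7%") at face value, so the result is an illustration of the best stated case rather than a measured value; the function name is hypothetical.

```python
def perf_per_watt_gain(power_reduction: float, perf_loss: float) -> float:
    """Relative change in performance per watt after voltage scaling.

    Both arguments are fractions of the baseline (0.40 means 40%).
    """
    if not (0.0 <= power_reduction < 1.0 and 0.0 <= perf_loss < 1.0):
        raise ValueError("fractions must lie in [0, 1)")
    # New performance divided by new power, both relative to baseline.
    return (1.0 - perf_loss) / (1.0 - power_reduction)


gain = perf_per_watt_gain(power_reduction=0.40, perf_loss=0.07)
print(f"performance per watt: {gain:.2f}x baseline")
```

With these inputs, performance per watt comes out at about 1.55 times the baseline, i.e., 93% of the performance for 60% of the power.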
[1] Sharad Malik, et al. Compile-time dynamic voltage scaling settings: opportunities and limits. PLDI '03, 2003.
[2] Wu-chun Feng, et al. Making a Case for Efficient Supercomputing. ACM Queue, 2003.
[3] Wu-chun Feng, et al. High-Density Computing: A 240-Processor Beowulf in One Cubic Meter. ACM/IEEE SC 2002 Conference (SC'02), 2002.
[4] Wu-chun Feng, et al. The Bladed Beowulf: a cost-effective alternative to traditional Beowulfs. Proceedings. IEEE International Conference on Cluster Computing, 2002.
[5] Philip Levis, et al. Policies for dynamic clock scheduling. OSDI, 2000.
[6] Eric Rotenberg, et al. FAST: Frequency-aware static timing analysis. TECS, 2006.