Web Search Using Small Cores: Quantifying the Price of Efficiency

The commoditization of hardware, data center economies of scale, and Internet-scale workload growth all demand greater power efficiency to sustain scalability. Traditional enterprise workloads, which are typically memory and I/O bound, have been well served by chip multiprocessors composed of small, power-efficient cores. While small cores deliver performance-per-Watt efficiency for such data center workloads, they impact application quality-of-service robustness, flexibility, and reliability for emerging Internet-scale applications, which increasingly invoke computationally intensive kernels. These challenges constitute the price of efficiency, which we quantify for an industry-strength, production-quality, next-generation online web search engine. Specifically, we evaluate search on server- and mobile-class architectures using Xeon and Atom processors, quantifying search efficiency at the microarchitecture and system levels. Our findings prompt us to rethink small-core designs for a new breed of data center workloads in order to continue reaping the benefits of small-core power efficiency.
