Datacenter Design and Management: A Computer Architect's Perspective

Abstract: An era of big data demands datacenters, which house the computing infrastructure that translates raw data into valuable information. This book defines datacenters broadly, as large distributed systems that perform parallel computation for diverse users. These systems exist in multiple forms, private and public, and are built at multiple scales. Datacenter design and management is multifaceted, requiring the simultaneous pursuit of multiple objectives. Performance, efficiency, and fairness are first-order design and management objectives, each of which can be viewed from several perspectives. This book surveys datacenter research from a computer architect's perspective, addressing challenges in applications, design, management, server simulation, and system simulation. This perspective complements the rich bodies of work in datacenters as a warehouse-scale system, which study the implications for infrastructure that encloses computing equipment, and in datacenters as distributed systems, which employ a...
