Whare-map: heterogeneity in "homogeneous" warehouse-scale computers

Modern "warehouse-scale computers" (WSCs) continue to be embraced as homogeneous computing platforms. However, because machines are frequently replaced and upgraded, modern WSCs are in fact composed of diverse commodity microarchitectures and machine configurations. Current WSCs are nevertheless architected under the assumption of homogeneity, leaving a potentially significant performance opportunity unexplored. In this paper, we expose and quantify the performance impact of this "homogeneity assumption" for modern production WSCs using industry-strength, large-scale web-service workloads. We also argue for, and evaluate the benefits of, a heterogeneity-aware WSC using commercial web-service production workloads, including Google's web search. We identify the key factors that affect the performance opportunity available when exploiting heterogeneity, and we introduce a new metric, the opportunity factor, to quantify an application's sensitivity to the heterogeneity in a given WSC. To exploit heterogeneity in "homogeneous" WSCs, we propose "Whare-Map", the WSC Heterogeneity Aware Mapper, which leverages the continuous profiling subsystems already in place in production environments. When employing "Whare-Map", we observe a cluster-wide performance improvement of 15% on average over heterogeneity-oblivious job placement, and an improvement of up to 80% for web-service applications that are particularly sensitive to heterogeneity.
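As a rough illustration of why placement matters on mixed hardware, the toy sketch below compares a heterogeneity-oblivious placement with a heterogeneity-aware one. It is not the Whare-Map algorithm, which the abstract does not specify: the job names, machine types, and performance numbers are invented, and exhaustive search stands in for whatever mapping strategy a real system would use.

```python
from itertools import permutations

# Hypothetical normalized performance of each job type on each machine type.
# These values are made up for illustration; they are not from the paper.
PERF = {
    "search":  {"old": 1.0, "new": 1.8},   # sensitive to the machine type
    "storage": {"old": 1.0, "new": 1.1},   # nearly insensitive
}

def total_perf(jobs, machines):
    """Sum of per-job performance when jobs[i] runs on machines[i]."""
    return sum(PERF[job][machine] for job, machine in zip(jobs, machines))

def oblivious_map(jobs, machines):
    """Heterogeneity-oblivious placement: use machines in the order given."""
    return list(machines)

def aware_map(jobs, machines):
    """Heterogeneity-aware placement by exhaustive search (tiny inputs only)."""
    best = max(permutations(machines), key=lambda p: total_perf(jobs, p))
    return list(best)

jobs = ["storage", "storage", "search", "search"]
machines = ["new", "new", "old", "old"]

baseline = total_perf(jobs, oblivious_map(jobs, machines))
improved = total_perf(jobs, aware_map(jobs, machines))
print(f"oblivious: {baseline:.2f}, aware: {improved:.2f}")
```

In this made-up instance, the aware mapper moves the sensitive jobs onto the machines where they run best and leaves the insensitive jobs on the remaining ones. Exhaustive search grows factorially with cluster size, so it is only a stand-in for a scalable search method.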
