Amdahl's law for tail latency

Queueing-theoretic models can guide design trade-offs in systems that target tail latency, not just average performance.
