TPC: Target-Driven Parallelism Combining Prediction and Correction to Reduce Tail Latency in Interactive Services

Hwanju Kim | Alan L. Cox | Sameh Elnikety | Yuxiong He | Scott Rixner | Myeongjae Jeon

[1] Nathan Clark, et al. Thread tailor: dynamically weaving threads together for efficient, adaptive parallel applications, 2010, ISCA.

[2] Torsten Suel, et al. Using graphics processors for high-performance IR query processing, 2008, WWW.

[3] Patrick Wendell, et al. Sparrow: distributed, low latency scheduling, 2013, SOSP.

[4] Alexandros Stamatakis, et al. Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems, 2007, Parallel Comput.

[5] Wolfgang Lehner, et al. Fast integer compression using SIMD instructions, 2010, DaMoN '10.

[6] Shirish Tatikonda, et al. Posting list intersection on multicore architectures, 2011, SIGIR.

[7] Alan L. Cox, et al. Adaptive parallelism for web search, 2013, EuroSys '13.

[8] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.

[9] Luiz André Barroso, et al. Web Search for a Planet: The Google Cluster Architecture, 2003, IEEE Micro.

[10] Ricardo Bianchini, et al. Few-to-Many: Incremental Parallelism for Reducing Tail Latency in Interactive Services, 2015, ASPLOS.

[11] Eitan Frachtenberg, et al. Reducing Query Latencies in Web Search Using Fine-Grained Parallelism, 2009, World Wide Web.

[12] Shaolei Ren, et al. Exploiting Processor Heterogeneity in Interactive Services, 2013, ICAC.

[13] Sameh Elnikety, et al. Tians Scheduling: Using Partial Processing in Best-Effort Applications, 2011, 2011 31st International Conference on Distributed Computing Systems.

[14] Jeffrey Dean, et al. Challenges in building large-scale information retrieval systems: invited talk, 2009, WSDM '09.

[15] Christo Wilson, et al. Better never than late, 2011, SIGCOMM 2011.

[16] Raj Vaswani, et al. A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors, 1993, TOCS.

[17] Vijay Janapa Reddi, et al. High-performance and energy-efficient mobile web browsing on big/little systems, 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[18] Margaret Martonosi, et al. Characterizing and improving the performance of Intel Threading Building Blocks, 2008, 2008 IEEE International Symposium on Workload Characterization.

[19] Ion Stoica, et al. The Power of Choice in Data-Aware Cluster Scheduling, 2014, OSDI.

[20] Craig MacDonald, et al. Learning to predict response times for online query scheduling, 2012, SIGIR '12.

[21] Fabrizio Silvestri, et al. Prefetching query results and its impact on search engines, 2012, SIGIR '12.

[22] Seung-won Hwang, et al. Predictive parallelization: taming tail latencies in web search, 2014, SIGIR.

[23] Arun Raman, et al. Parallelism orchestration using DoPE: the degree of parallelism executive, 2011, PLDI '11.

[24] Aristides Gionis, et al. The impact of caching on search engines, 2007, SIGIR.

[25] Berkant Barla Cambazoglu, et al. A refreshing perspective of search engine caching, 2010, WWW '10.

[26] Stijn Eyerman, et al. The benefit of SMT in the multi-core era: flexibility towards degrees of thread-level parallelism, 2014, ASPLOS.

[27] T. N. Vijaykumar, et al. Deadline-aware datacenter tcp (D2TCP), 2012, CCRV.

[28] Thomas F. Wenisch, et al. Power management of online data-intensive services, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[29] Torsten Suel, et al. Improved techniques for result caching in web search engines, 2009, WWW '09.

[30] Satish Narayanasamy, et al. DoublePlay: parallelizing sequential logging and replay, 2011, ASPLOS XVI.

[31] Gustavo Alonso, et al. Pydron: Semi-Automatic Parallelization for Multi-Core and the Cloud, 2014, OSDI.

[32] Kevin Skadron, et al. Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations, 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[33] Sebastian Burckhardt, et al. The design of a task parallel library, 2009, OOPSLA 2009.

[34] Srikanth Kandula, et al. Speeding up distributed request-response workflows, 2013, SIGCOMM.

[35] Alfons Kemper, et al. Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems, 2012, Proc. VLDB Endow.

[36] Chang-Gun Lee, et al. Multicore scheduling of parallel real-time tasks with multiple parallelization options, 2015, 21st IEEE Real-Time and Embedded Technology and Applications Symposium.

[37] Ronald G. Dreslinski, et al. Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting, 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[38] Wolfgang Lehner, et al. Fast Sorted-Set Intersection using SIMD Instructions, 2011, ADMS@VLDB.

[39] Yuxiong He, et al. Provably Efficient Online Nonclairvoyant Adaptive Scheduling, 2008, IEEE Trans. Parallel Distributed Syst.