µTune: Auto-Tuned Threading for OLDI Microservices
[1] Steve Vinoski,et al. Node.js: Using JavaScript to Build High-Performance Network Programs , 2010, IEEE Internet Comput..
[2] Willy Zwaenepoel,et al. Flash: An efficient and portable Web server , 1999, USENIX Annual Technical Conference, General Track.
[3] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Douglas C. Schmidt,et al. Applying the Proactor Pattern to High-Performance Web Servers , 1998 .
[5] David A. Patterson,et al. Attack of the killer microseconds , 2017, Commun. ACM.
[6] Alexandr Andoni,et al. Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[7] T. N. Vijaykumar,et al. Deadline-aware datacenter TCP (D2TCP) , 2012, SIGCOMM '12.
[8] Luis Ceze,et al. NCAM: Near-Data Processing for Nearest Neighbor Search , 2015, MEMSYS.
[9] 吉野 智興,et al. Programmer's guide , 1993 .
[10] Michael F. P. O'Boyle,et al. Mapping parallelism to multi-cores: a machine learning based approach , 2009, PPoPP '09.
[11] Eric A. Brewer,et al. USENIX Association Proceedings of HotOS IX: The 9th Workshop on Hot Topics in Operating Systems , 2003 .
[12] Alexandr Andoni,et al. Practical and Optimal LSH for Angular Distance , 2015, NIPS.
[13] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[14] Roger M. Needham,et al. Denial of service , 1993, CCS '93.
[15] David E. Culler,et al. SEDA: an architecture for well-conditioned, scalable internet services , 2001, SOSP.
[16] Ronald G. Dreslinski,et al. Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[17] Brad Fitzpatrick,et al. Distributed caching with memcached , 2004 .
[18] Zhe Wang,et al. Modeling LSH for performance tuning , 2008, CIKM '08.
[19] Lingjia Tang,et al. Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[20] Thu D. Nguyen,et al. Exploiting Heterogeneity for Tail Latency and Energy Efficiency , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[21] Adam Silberstein,et al. Benchmarking cloud serving systems with YCSB , 2010, SoCC '10.
[22] Thomas F. Wenisch,et al. Power management of online data-intensive services , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[23] Trevor Darrell,et al. Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[24] Yuzuru Tanaka,et al. Spherical LSH for Approximate Nearest Neighbor Search on Unit Hypersphere , 2007, WADS.
[25] Luiz André Barroso,et al. The Case for Energy-Proportional Computing , 2007, Computer.
[26] Mayank Bawa,et al. LSH forest: self-tuning indexes for similarity search , 2005, WWW '05.
[27] Roberto Rojas-Cessa,et al. Schemes for Fast Transmission of Flows in Data Center Networks , 2015, IEEE Communications Surveys & Tutorials.
[28] Douglas C. Schmidt,et al. Experience Using Design Patterns to Evolve Communication Software Across Diverse OS Platforms , 1995, ECOOP.
[29] Douglas C. Schmidt,et al. JAWS: A Framework for High-performance Web Servers , 1998 .
[30] Ron Kohavi,et al. Practical guide to controlled experiments on the web: listen to your customers not to the hippo , 2007, KDD '07.
[31] Scott F. Midkiff,et al. Denial-of-Service in Wireless Sensor Networks: Attacks and Defenses , 2008, IEEE Pervasive Computing.
[32] Qingyang Wang,et al. Performance Comparison of Web Servers with Different Architectures: A Case Study Using High Concurrency Workload , 2015, 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb).
[33] Jonathan Goldstein,et al. MTCache: transparent mid-tier database caching in SQL server , 2004, Proceedings. 20th International Conference on Data Engineering.
[34] Amitabh Sinha,et al. Non-Clairvoyant Scheduling for Minimizing Mean Slowdown , 2003, Algorithmica.
[35] Seung-won Hwang,et al. Predictive parallelization: taming tail latencies in web search , 2014, SIGIR.
[36] Edouard Bugnion,et al. ZygOS: Achieving Low Tail Latency for Microsecond-scale Networked Tasks , 2017, SOSP.
[37] Yuxiong He,et al. Provably Efficient Online Nonclairvoyant Adaptive Scheduling , 2007, IEEE Transactions on Parallel and Distributed Systems.
[38] Hui Ding,et al. TAO: Facebook's Distributed Data Store for the Social Graph , 2013, USENIX Annual Technical Conference.
[39] Berkant Barla Cambazoglu,et al. Impact of response latency on user behavior in web search , 2014, SIGIR.
[40] Calton Pu,et al. A Study of Long-Tail Latency in n-Tier Systems: RPC vs. Asynchronous Invocations , 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).
[41] David E. Culler,et al. SEDA: An Architecture for Scalable, Well-Conditioned Internet Services , 2001 .
[42] Ricardo Bianchini,et al. Few-to-Many: Incremental Parallelism for Reducing Tail Latency in Interactive Services , 2015, ASPLOS.
[43] Allan Kuchinsky,et al. Quality is in the eye of the beholder: meeting users' requirements for Internet quality of service , 2000, CHI.
[44] Babak Falsafi,et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware , 2012, ASPLOS XVII.
[45] Thomas F. Wenisch,et al. Deconstructing the Tail at Scale Effect Across Network Protocols , 2017, ArXiv.
[46] Piotr Indyk,et al. Similarity Search in High Dimensions via Hashing , 1999, VLDB.
[47] Rubby Casallas,et al. Evaluating the monolithic and the microservice architecture pattern to deploy web applications in the cloud , 2015, 2015 10th Computing Colombian Conference (10CCC).
[48] Chita R. Das,et al. Characterizing Network Traffic in a Cluster-based, Multi-tier Data Center , 2007, 27th International Conference on Distributed Computing Systems (ICDCS '07).
[49] Nathan Clark,et al. Thread tailor: dynamically weaving threads together for efficient, adaptive parallel applications , 2010, ISCA.
[50] Christoforos E. Kozyrakis,et al. IX: A Protected Dataplane Operating System for High Throughput and Low Latency , 2014, OSDI.
[51] T.F. Abdelzaher,et al. Web server QoS management by adaptive content delivery , 1999, 1999 Seventh International Workshop on Quality of Service. IWQoS'99. (Cat. No.98EX354).
[52] Tracy Mullen,et al. Analysis of optimal thread pool size , 2000 .
[53] Eitan Frachtenberg,et al. Reducing Query Latencies in Web Search Using Fine-Grained Parallelism , 2009, World Wide Web.
[54] Eunyoung Jeong,et al. mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems , 2014, NSDI.
[55] Christoforos E. Kozyrakis,et al. Heracles: Improving resource efficiency at scale , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[56] W. John Wilbur,et al. The automatic identification of stop words , 1992, J. Inf. Sci..
[57] Zhe Wang,et al. Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search , 2007, VLDB.
[58] Piotr Indyk,et al. Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.
[59] Dimitrios S. Nikolopoulos,et al. Online power-performance adaptation of multithreaded programs using hardware event-based prediction , 2006, ICS '06.
[60] Raj Vaswani,et al. A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors , 1993, TOCS.
[61] Maria Kihl,et al. Web server performance modeling using an M/G/1/K*PS queue , 2003, 10th International Conference on Telecommunications, 2003. ICT 2003..
[62] Peter R. Pietzuch,et al. Distributed event-based systems , 2006 .
[63] Christoforos E. Kozyrakis,et al. Energy proportionality and workload consolidation for latency-critical applications , 2015, SoCC.
[64] Luiz André Barroso,et al. Web Search for a Planet: The Google Cluster Architecture , 2003, IEEE Micro.
[65] William B. March,et al. MLPACK: a scalable C++ machine learning library , 2012, J. Mach. Learn. Res..
[66] Alexandros Stamatakis,et al. Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems , 2007, Parallel Comput..
[67] Panos Kalnis,et al. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space , 2010, TODS.
[68] Roy T. Fielding,et al. The Apache HTTP Server Project , 1997, IEEE Internet Comput..
[69] Timothy Roscoe,et al. Arrakis , 2014, OSDI.
[70] Brahim Medjahed,et al. A Query Rewriting Approach for Web Service Composition , 2010, IEEE Transactions on Services Computing.
[71] Dan Tsafrir,et al. The context-switch overhead inflicted by hardware interrupts (and the enigma of do-nothing loops) , 2007, ExpCS '07.
[72] Daniel Sánchez,et al. Rubik: Fast analytical power management for latency-critical systems , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[73] Panos Kalnis,et al. Quality and efficiency in high dimensional nearest neighbor search , 2009, SIGMOD Conference.
[74] Laxmi N. Bhuyan,et al. Thread reinforcer: Dynamically determining number of threads via OS level monitoring , 2011, 2011 IEEE International Symposium on Workload Characterization (IISWC).
[75] Thomas F. Wenisch,et al. μSuite: A Benchmark Suite for Microservices , 2018, 2018 IEEE International Symposium on Workload Characterization (IISWC).
[76] Jialin Li,et al. Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency , 2014, SoCC.
[77] Mike Amundsen,et al. Microservice Architecture: Aligning Principles, Practices, and Culture , 2016 .
[78] Ryan Johnson,et al. Decoupling contention management from scheduling , 2010, ASPLOS XV.
[79] Xiaola Lin,et al. Analysis of optimal thread pool size , 2000, OPSR.
[80] Tony Tung,et al. Scaling Memcache at Facebook , 2013, NSDI.
[81] Jeffrey S. Chase,et al. Balance of power: dynamic thermal management for Internet data centers , 2005, IEEE Internet Computing.
[82] Douglas C. Schmidt,et al. Applying patterns to develop extensible ORB middleware , 1999, IEEE Commun. Mag..
[83] David M. Brooks,et al. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[84] Seung-won Hwang,et al. Delayed-Dynamic-Selective (DDS) Prediction for Reducing Extreme Tail Latency in Web Search , 2015, WSDM.
[85] Eric N. Herness,et al. WebSphere Application Server: A foundation for on demand computing , 2004, IBM Syst. J..
[86] Ruby B. Lee,et al. Distributed Denial of Service: Taxonomies of Attacks, Tools, and Countermeasures , 2004, PDCS.
[87] F. Maxwell Harper,et al. The MovieLens Datasets: History and Context , 2016, TIIS.
[88] Josiah L. Carlson,et al. Redis in Action , 2013 .
[89] T. N. Vijaykumar,et al. TimeTrader: Exploiting latency tail to save datacenter energy for online search , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[90] Michael A. Casey,et al. Locality-Sensitive Hashing for Finding Nearest Neighbors , 2008 .
[91] Nicole Immorlica,et al. Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.
[92] Yale N. Patt,et al. Feedback-driven threading: power-efficient and high-performance execution of multi-threaded workloads on CMPs , 2008, ASPLOS.
[93] K. Langendoen,et al. Integrating polling, interrupts, and thread management , 1996, Proceedings of 6th Symposium on the Frontiers of Massively Parallel Computation (Frontiers '96).
[94] Amin Vahdat,et al. Chronos: predictable low latency for data center applications , 2012, SoCC '12.
[95] Rafail Ostrovsky,et al. Efficient search for approximate nearest neighbor in high dimensional spaces , 1998, STOC '98.
[96] David G. Lowe,et al. Scalable Nearest Neighbor Algorithms for High Dimensional Data , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[97] Borko Furht,et al. Handbook of Cloud Computing , 2010 .
[98] Dimitrios S. Nikolopoulos,et al. Effective cross-platform, multilevel parallelism via dynamic adaptive execution , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[99] Chen Ding,et al. Quantifying the cost of context switch , 2007, ExpCS '07.
[100] Dong Liu,et al. The Reverse C10K Problem for Server-Side Mashups , 2009, ICSOC Workshops.
[101] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[102] Gu-Yeon Wei,et al. Tradeoffs between power management and tail latency in warehouse-scale applications , 2014, 2014 IEEE International Symposium on Workload Characterization (IISWC).
[103] Hyeontaek Lim,et al. MICA: A Holistic Approach to Fast In-Memory Key-Value Storage , 2014, NSDI.
[104] Antony I. T. Rowstron,et al. Better never than late: meeting deadlines in datacenter networks , 2011, SIGCOMM.
[105] Shiliang Hu,et al. LASER: Light, Accurate Sharing dEtection and Repair , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[106] David R. Cheriton,et al. Comparing the performance of web server architectures , 2007, EuroSys '07.
[107] Jaejin Lee,et al. Adaptive execution techniques for SMT multiprocessor architectures , 2005, PPOPP.
[108] Dmitry Namiot,et al. On micro-services architecture , 2014 .
[109] Christoforos E. Kozyrakis,et al. Towards energy proportionality for large-scale latency-critical workloads , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[110] Luiz André Barroso,et al. The tail at scale , 2013, CACM.