
AutoTune: Improving End-to-end Performance and Resource Efficiency for Microservice Applications

Most large web-scale applications are now built by composing collections (from a few up to hundreds or thousands) of microservices. Operators must decide how many resources to allocate to each microservice, and these allocations can have a large impact on application performance. Manually determining allocations that are both cost-efficient and meet performance requirements is challenging, even for experienced operators. In this paper we present AutoTune, an end-to-end tool that automatically minimizes resource utilization while maintaining good application performance.

Scott Shenker | Aurojit Panda | Michael Alan Chang | Hantao Wang | Yuancheng Tsai | Rahul Balakrishnan

[1] Minlan Yu, et al. CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics, 2017, NSDI.

[2] Gautam Kumar, et al. Hold 'em or fold 'em?: aggregation queries under performance variations, 2016, EuroSys.

[3] Aric Hagberg, et al. Exploring Network Structure, Dynamics, and Function using NetworkX, 2008.

[4] Amin Vahdat, et al. DieCast: Testing Distributed Systems with an Accurate Scale Model, 2008, TOCS.

[5] Katerina J. Argyraki, et al. ResQ: Enabling SLOs in Network Function Virtualization, 2018, NSDI.

[6] Christof Fetzer, et al. Sieve: Actionable Insights from Monitored Metrics in Microservices, 2017, ArXiv.

[7] Ion Stoica, et al. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics, 2016, NSDI.

[8] Alessandro Orso, et al. RAIN: Refinable Attack Investigation with On-demand Inter-Process Information Flow Tracking, 2017, CCS.

[9] Srikanth Kandula, et al. Multi-resource packing for cluster schedulers, 2014, SIGCOMM.

[10] Luiz André Barroso, et al. The tail at scale, 2013, CACM.

[11] Emery D. Berger, et al. Coz: finding code that counts with causal profiling, 2015, USENIX Annual Technical Conference.

[12] Xiangyu Zhang, et al. High Accuracy Attack Provenance via Binary-based Execution Partition, 2013, NDSS.

[13] Carlo Curino, et al. Morpheus: Towards Automated SLOs for Enterprise Clusters, 2016, OSDI.

[14] Randy H. Katz, et al. Selecting the best VM across multiple public clouds: a data-driven performance modeling approach, 2017, SoCC.

[15] Srikanth Kandula, et al. Resource Management with Deep Reinforcement Learning, 2016, HotNets.

[16] Srikanth Kandula, et al. Speeding up distributed request-response workflows, 2013, SIGCOMM.

[17] Christina Delimitrou, et al. Paragon: QoS-aware scheduling for heterogeneous datacenters, 2013, ASPLOS '13.

[18] Eric A. Brewer, et al. Borg, Omega, and Kubernetes, 2016, ACM Queue.

[19] Gregory R. Ganger, et al. The DiskSim Simulation Environment Version 4.0 Reference Manual (CMU-PDL-08-101), 1998.

[20] Barton P. Miller, et al. The Paradyn Parallel Performance Measurement Tool, 1995, Computer.

[21] Christina Delimitrou, et al. Quasar: resource-efficient and QoS-aware cluster management, 2014, ASPLOS.

[22] Rushil Anirudh, et al. Performance Modeling under Resource Constraints Using Deep Transfer Learning, 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.

[23] Christoforos E. Kozyrakis, et al. Heracles: Improving resource efficiency at scale, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[24] Leonid Ryzhyk, et al. Automating Cluster Management with Weave, 2019, ArXiv.

[25] Benjamin Hindman, et al. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types, 2011, NSDI.

[26] Wonho Kim, et al. Kraken: Leveraging Live Traffic Tests to Identify and Resolve Resource Utilization Bottlenecks in Large Scale Web Services, 2016, OSDI.

[27] Lars Koesterke, et al. PerfExpert: An Easy-to-Use Performance Diagnosis Tool for HPC Applications, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.

[28] Damon Wischik, et al. SHRiNK: a method for enabling scaleable performance prediction and efficient network simulation, 2005, IEEE/ACM Transactions on Networking.

[29] Amin Vahdat, et al. To infinity and beyond: time warped network emulation, 2005, SOSP '05.