Optimus Prime: Accelerating Data Transformation in Servers
Babak Falsafi | Christoph Koch | Siddharth Gupta | Mark Sutherland | Mario Paulo Drumond | Arash Pourhabibi Zarandi | Hussein Kassir | Zilu Tian
[1] Reetuparna Das, et al. Parallel automata processor, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[2] Andrew A. Chien, et al. UDP: A Programmable Accelerator for Extract-Transform-Load Workloads and More, 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[3] Karl-Heinz Krempels, et al. A Structured Approach to Support Collaborative Design, Specification and Documentation of Communication Protocols, 2018, ENASE.
[4] Yong Wang, et al. Overload Control for Scaling WeChat Microservices, 2018, SoCC.
[5] Scott Shenker, et al. Network Requirements for Resource Disaggregation, 2016, OSDI.
[6] Andrew W. Moore, et al. Understanding PCIe performance for end host networking, 2018, SIGCOMM.
[7] Thomas F. Wenisch, et al. µTune: Auto-Tuned Threading for OLDI Microservices, 2018, OSDI.
[8] Dean M. Tullsen, et al. Multithreading Architecture, 2013, Multithreading Architecture.
[9] John K. Ousterhout, et al. Homa: a receiver-driven low-latency transport protocol using network priorities, 2018, SIGCOMM.
[10] Thomas F. Wenisch, et al. HARE: Hardware accelerator for regular expressions, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[11] Tudor David, et al. Everything you always wanted to know about synchronization but were afraid to ask, 2013, SOSP.
[12] Gu-Yeon Wei, et al. Profiling a Warehouse-Scale Computer, 2016, IEEE Micro.
[13] Alex C. Snoeren, et al. Inside the Social Network's (Datacenter) Network, 2015, Comput. Commun. Rev.
[14] Edouard Bugnion, et al. ZygOS: Achieving Low Tail Latency for Microsecond-scale Networked Tasks, 2017, SOSP.
[15] Scott A. Mahlke, et al. A comparison of full and partial predicated execution support for ILP processors, 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[16] Qi Huang, et al. SVE: Distributed Video Processing at Facebook Scale, 2017, SOSP.
[17] Hui Ding, et al. TAO: Facebook's Distributed Data Store for the Social Graph, 2013, USENIX Annual Technical Conference.
[18] Luiz André Barroso, et al. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2009, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines.
[19] Michael Kaminsky, et al. Datacenter RPCs can be General and Fast, 2018, NSDI.
[20] Dave Brown, et al. Supplementary Material for An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing, 2013.
[21] Thomas F. Wenisch, et al. SimFlex: Statistical Sampling of Computer System Simulation, 2006, IEEE Micro.
[22] David Wentzlaff, et al. Power and Energy Characterization of an Open Source 25-Core Manycore Processor, 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[23] Luca Benini, et al. Towards near-threshold server processors, 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[24] Yang Li, et al. Service fabric: a distributed platform for building microservices in the cloud, 2018, EuroSys.
[25] Luiz André Barroso, et al. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition, 2013, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition.
[26] Albert G. Greenberg, et al. Data center TCP (DCTCP), 2010, SIGCOMM '10.
[27] Christina Delimitrou, et al. The Architectural Implications of Cloud Microservices, 2018, IEEE Computer Architecture Letters.
[28] Yuan He, et al. An Open-Source Benchmark Suite for Microservices and Their Hardware-Software Implications for Cloud & Edge Systems, 2019, ASPLOS.
[29] Mark Handley, et al. Re-architecting datacenter networks and stacks for low latency and high performance, 2017, SIGCOMM.
[30] Thomas F. Wenisch, et al. μSuite: A Benchmark Suite for Microservices, 2018, 2018 IEEE International Symposium on Workload Characterization (IISWC).
[31] James E. Smith, et al. Decoupled access/execute computer architectures, 1984, TOCS.
[32] Norman P. Jouppi, et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[33] David A. Patterson, et al. Attack of the killer microseconds, 2017, Commun. ACM.
[34] Christoforos E. Kozyrakis, et al. IX: A Protected Dataplane Operating System for High Throughput and Low Latency, 2014, OSDI.
[35] Ying Zhang, et al. FBOSS: building switch software at scale, 2018, SIGCOMM.