Balaji Prabhakar | Mendel Rosenblum | Sean Choi | Muhammad Shahbaz
[1] Christoforos E. Kozyrakis,et al. From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers , 2019, USENIX Annual Technical Conference.
[2] Mohak Shah,et al. Comparative Study of Deep Learning Software Frameworks , 2015, 1511.06435.
[3] Christoforos E. Kozyrakis,et al. Shinjuku: Preemptive Scheduling for μsecond-scale Tail Latency , 2019, NSDI.
[4] Brian N. Bershad,et al. Characterizing processor architectures for programmable network interfaces , 2000 .
[5] Andrew W. Moore,et al. Characterizing 10 Gbps network interface energy consumption , 2010, IEEE Local Computer Network Conference.
[6] Rastislav Bodík,et al. Floem: A Programming System for NIC-Accelerated Network Applications , 2018, OSDI.
[7] Thomas E. Anderson,et al. Ingress Pipeline Queues Packet Buffer DMA PipelineDMA Egress Pipeline , 2015 .
[8] Yajun Ha,et al. The Optimization of Interconnection Networks in FPGAs , 2010, Dynamically Reconfigurable Architectures.
[9] Ju Wang,et al. Windows Azure Storage: a highly available cloud storage service with strong consistency , 2011, SOSP.
[10] Kunle Olukotun,et al. OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning , 2011, ICML.
[11] Andrea C. Arpaci-Dusseau,et al. Serverless Computation with OpenLambda , 2016, HotCloud.
[12] Karan Gupta,et al. Offloading distributed applications onto smartNICs using iPipe , 2019, SIGCOMM.
[13] Dirk Merkel,et al. Docker: lightweight Linux containers for consistent development and deployment , 2014 .
[14] Tim Dettmers,et al. 8-Bit Approximations for Parallelism in Deep Learning , 2015, ICLR.
[15] George Varghese,et al. P4: programming protocol-independent packet processors , 2013, CCRV.
[16] Hari Angepat,et al. A cloud-scale acceleration architecture , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[17] Elad Hoffer,et al. Scalable Methods for 8-bit Training of Neural Networks , 2018, NeurIPS.
[18] Srinivasan Seshan,et al. Hyperloop: group-based NIC-offloading to accelerate replicated transactions in multi-tenant storage systems , 2018, SIGCOMM.
[19] Herbert Bos,et al. On Sockets and System Calls: Minimizing Context Switches for the Socket API , 2014, TRIOS.
[20] Matt Holdrege,et al. IP Network Address Translator (NAT) Terminology and Considerations , 1999, RFC.
[21] Alex Glikson,et al. Deviceless edge computing: extending serverless computing to the edge of the network , 2017, SYSTOR.
[22] Nick Feamster,et al. The case for an intermediate representation for programmable data planes , 2015, SOSR.
[23] John K. Ousterhout,et al. Homa: a receiver-driven low-latency transport protocol using network priorities , 2018, SIGCOMM.
[24] Edouard Bugnion,et al. R2P2: Making RPCs first-class datacenter citizens , 2019, USENIX ATC.
[25] Benjamin Hindman,et al. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types , 2011, NSDI.
[26] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[27] Arvind Krishnamurthy,et al. E3: Energy-Efficient Microservices on SmartNIC-Accelerated Servers , 2019, USENIX ATC.
[28] David Walker,et al. Enabling Programmable Transport Protocols in High-Speed NICs , 2020, NSDI.
[29] John K. Ousterhout,et al. In Search of an Understandable Consensus Algorithm , 2014, USENIX ATC.
[30] Yousof Al-Hammadi,et al. Performance comparison between container-based and VM-based services , 2017, 2017 20th Conference on Innovations in Clouds, Internet and Networks (ICIN).
[31] Rob Pike. Go at Google , 2012, SPLASH '12.
[32] Kushagra Vaid,et al. Azure Accelerated Networking: SmartNICs in the Public Cloud , 2018, NSDI.
[33] Minlan Yu,et al. SilkRoad: Making Stateful Layer-4 Load Balancing Fast and Cheap Using Switching ASICs , 2017, SIGCOMM.
[34] 丸山 勉,et al. Accelerating the Computation of Complex Adaptive Systems with Field Programmable Gate Arrays , 1999 .
[35] Nate Foster,et al. NetCache: Balancing Key-Value Stores with Fast In-Network Caching , 2017, SOSP.
[36] Fernando Pedone,et al. The Case For In-Network Computing On Demand , 2019, EuroSys.
[37] Abhay Parekh,et al. A generalized processor sharing approach to flow control in integrated services networks: the single-node case , 1993, TNET.
[38] Matthias Blume,et al. Taming the IXP network processor , 2003, PLDI.
[39] George Varghese,et al. Forwarding metamorphosis: fast programmable match-action processing in hardware for SDN , 2013, SIGCOMM.
[40] David A. Patterson,et al. A new golden age for computer architecture , 2019, Commun. ACM.
[41] Paul A. Viola,et al. Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.
[42] Nick McKeown,et al. Programmable Packet Scheduling at Line Rate , 2016, SIGCOMM.
[43] Fred Douglis,et al. Virtualization , 2013, IEEE Internet Comput..
[44] Jennifer Rexford,et al. HULA: Scalable Load Balancing Using Programmable Data Planes , 2016, SOSR.
[45] Fernando Pedone,et al. NetPaxos: consensus at network speed , 2015, SOSR.
[46] Martín Casado,et al. The Design and Implementation of Open vSwitch , 2015, NSDI.
[47] Thierry Marianne. Cloud Computing without Containers , 2018 .
[48] Michael K. Chen,et al. Shangri-La: achieving high performance from compiled network applications while enabling ease of programming , 2005, PLDI '05.
[49] Tony Tung,et al. Scaling Memcache at Facebook , 2013, NSDI.
[50] Hari Balakrishnan,et al. Shenango: Achieving High CPU Efficiency for Latency-sensitive Datacenter Workloads , 2019, NSDI.
[51] Mohak Shah,et al. Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning , 2015, ArXiv.
[52] Anshul Jaiswal,et al. Realtime Data Processing at Facebook , 2016, SIGMOD Conference.
[53] Tao Wang,et al. Deep learning with COTS HPC systems , 2013, ICML.