Missing the Forest for the Trees: End-to-End AI Application Performance in Edge Data Centers
Ankit Patel | Vijay Janapa Reddi | Ramesh Illikkal | Daniel Richins | Dharmisha Doshi | Matthew Blackmore | Aswathy Thulaseedharan Nair | Neha Pathapati | Brainard Daguman | Daniel Dobrijalowski | Kevin Long | David Zimmerman
[1] Charles E. Leiserson, et al. Fat-trees: Universal networks for hardware-efficient supercomputing, 1985, IEEE Transactions on Computers.
[2] D. R. Long. A Case for Case Studies, 1986.
[3] Kenneth P. Birman, et al. Exploiting virtual synchrony in distributed systems, 1987, SOSP '87.
[4] Jonathan Arnowitz, et al. The case for case studies, 2005, INTR.
[5] Luis Entrena, et al. Hardware Architectures for Image Processing Acceleration, 2009.
[6] Andrea Cavallaro, et al. Video Analytics for Surveillance: Theory and Practice [From the Guest Editors], 2010.
[7] Lingjia Tang, et al. Heterogeneity in “Homogeneous” Warehouse-Scale Computers: A Performance Opportunity, 2011, IEEE Computer Architecture Letters.
[8] Ninghui Sun, et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning, 2014, ASPLOS.
[9] Thomas F. Wenisch, et al. The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services, 2014, OSDI.
[10] Jim Groom, et al. Docker - Build, Ship, and Run Any App, Anywhere, 2014.
[11] Jialin Li, et al. Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency, 2014, SoCC.
[12] Gu-Yeon Wei, et al. Profiling a warehouse-scale computer, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[13] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[14] James Philbin, et al. FaceNet: A unified embedding for face recognition and clustering, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Martin Kleppmann, et al. Kafka, Samza and the Unix Philosophy of Distributed Data, 2015, IEEE Data Eng. Bull.
[16] Hadi Esmaeilzadeh, et al. TABLA: A unified template-based framework for accelerating statistical machine learning, 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[17] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[18] Yu Qiao, et al. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks, 2016, IEEE Signal Processing Letters.
[19] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[20] Thu D. Nguyen, et al. Exploiting Heterogeneity for Tail Latency and Energy Efficiency, 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[21] Sergey Ioffe, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016, AAAI.
[22] Hari Angepat, et al. Serving DNNs in Real Time at Datacenter Scale with Project Brainwave, 2018, IEEE Micro.
[23] David M. Brooks, et al. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective, 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[24] Thomas Weise, et al. Apache Apex, 2019, Encyclopedia of Big Data Technologies.
[25] Cody Coleman, et al. MLPerf Inference Benchmark, 2019, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[26] Carole-Jean Wu, et al. The Architectural Implications of Facebook's DNN-Based Personalized Recommendation, 2019, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).