Minyi Guo | Li Li | Jingwen Leng | Wenli Zheng | Quan Chen | Zihan Liu | Chao Li
[1] Frédo Durand, et al. Taichi, 2019, ACM Trans. Graph.
[2] Albert Cohen, et al. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions, 2018, ArXiv.
[3] Uday Bondhugula, et al. A practical automatic polyhedral parallelizer and locality optimizer, 2008, PLDI '08.
[4] Jürgen Teich, et al. Automatic Kernel Fusion for Image Processing DSLs, 2018, SCOPES.
[5] Minyi Guo, et al. Adversarial Defense Through Network Profiling Based Path Extraction, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Minyi Guo, et al. Survey and design of paleozoic: a high-performance compiler tool chain for deep learning inference accelerator, 2020, CCF Transactions on High Performance Computing.
[7] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI 2013.
[8] Mark Sandler, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Alexander Aiken, et al. TASO: optimizing deep learning computation with automatic generation of graph substitutions, 2019, SOSP.
[11] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[12] Wei Yi, et al. Kernel Fusion: An Effective Method for Better Power Efficiency on Multithreaded GPU, 2010, 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing.
[13] Tianshi Chen, et al. Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach, 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[14] Minyi Guo, et al. Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration, 2020, 2020 57th ACM/IEEE Design Automation Conference (DAC).
[15] Minyi Guo, et al. Asymmetric Resilience: Exploiting Task-Level Idempotency for Transient Error Recovery in Accelerator-Based Systems, 2020, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[16] Minyi Guo, et al. Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity, 2020, SC20: International Conference for High Performance Computing, Networking, Storage and Analysis.
[17] Ruben Mayer, et al. The tensorflow partitioning and scheduling problem: it's the critical path!, 2017, ArXiv.
[18] Quan Chen, et al. Ebird: Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services, 2019, 2019 IEEE 37th International Conference on Computer Design (ICCD).
[19] Haichen Shen, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, 2018, OSDI.
[20] Samuel Williams, et al. Roofline: an insightful visual performance model for multicore architectures, 2009, CACM.
[21] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[22] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[23] Minyi Guo, et al. Ptolemy: Architecture Support for Robust Deep Learning, 2020, 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[24] Xuehai Zhou, et al. PuDianNao: A Polyvalent Machine Learning Accelerator, 2015, ASPLOS.
[25] Manoj Alwani, et al. Fused-layer CNN accelerators, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[26] Jürgen Teich, et al. From Loop Fusion to Kernel Fusion: A Domain-Specific Approach to Locality Optimization, 2019, 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).