Towards Automatic and Agile AI/ML Accelerator Design with End-to-End Synthesis
Gu-Yeon Wei | Joseph Manzano | Marco Minutoli | Vito Giovanni Castellana | Antonino Tumeo | David Brooks | Ankur Limaye | Cheng Tan | Vinay Amatya | Jeff Jun Zhang | Nicolas Bohm Agostini | Shihao Song
[1] Siddharth Garg,et al. CompAct: On-chip Compression of Activations for Low Power Systolic Array Based CNN Acceleration , 2019, ACM Trans. Embed. Comput. Syst..
[2] Hong Wang,et al. Loihi: A Neuromorphic Manycore Processor with On-Chip Learning , 2018, IEEE Micro.
[3] Christian Pilato,et al. Compiler Infrastructure for Specializing Domain-Specific Memory Templates , 2021, ArXiv.
[4] ScaleHLS: Achieving Scalable High-Level Synthesis through MLIR , 2021 .
[5] Christian Pilato,et al. Agile SoC Development with Open ESP: Invited Paper , 2020, 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD).
[6] Bertrand A. Maher,et al. Glow: Graph Lowering Compiler Techniques for Neural Networks , 2018, ArXiv.
[7] Qiang Wu,et al. A hierarchical CDFG as intermediate representation for hardware/software codesign , 2002, IEEE 2002 International Conference on Communications, Circuits and Systems and West Sino Expositions.
[8] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[9] Joseph Manzano,et al. Invited: Software Defined Accelerators From Learning Tools Environment , 2020, 2020 57th ACM/IEEE Design Automation Conference (DAC).
[10] Bernard Brezzo,et al. TrueNorth: Design and Tool Flow of a 65 mW 1 Million Neuron Programmable Neurosynaptic Chip , 2015, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[11] David R. Kaeli,et al. Design Space Exploration of Accelerators and End-to-End DNN Evaluation with TFLITE-SOC , 2020, 2020 IEEE 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD).
[12] Gu-Yeon Wei,et al. The Aladdin Approach to Accelerator Design and Modeling , 2015, IEEE Micro.
[13] Giacomo Indiveri,et al. A Scalable Multicore Architecture With Heterogeneous Memory Structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs) , 2017, IEEE Transactions on Biomedical Circuits and Systems.
[14] Gianluca Palermo,et al. Improving evolutionary exploration to area-time optimization of FPGA designs , 2008, J. Syst. Archit..
[15] Sumit Gupta,et al. SPARK: A Parallelizing Approach to the High-Level Synthesis of Digital Circuits , 2004 .
[16] Gu-Yeon Wei,et al. CHIPKIT: An Agile, Reusable Open-Source Framework for Rapid Test Chip Development , 2020, IEEE Micro.
[17] Quoc V. Le,et al. Chip Placement with Deep Reinforcement Learning , 2020, ArXiv.
[19] Gaurav Menghani,et al. Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better , 2021, ACM Comput. Surv..
[20] Yu Ting Chen,et al. A Survey and Evaluation of FPGA High-Level Synthesis Tools , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[21] Shoaib Kamil,et al. OpenTuner: An extensible framework for program autotuning , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[22] Yuan Xie,et al. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey , 2020, Proceedings of the IEEE.
[23] Vito Giovanni Castellana,et al. High-Level Synthesis of Parallel Specifications Coupling Static and Dynamic Controllers , 2021, 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[24] Nagarajan Kandasamy,et al. Endurance-Aware Mapping of Spiking Neural Networks to Neuromorphic Hardware , 2021, IEEE Transactions on Parallel and Distributed Systems.
[25] Marco Minutoli,et al. Svelto: High-Level Synthesis of Multi-Threaded Accelerators for Graph Analytics , 2021, IEEE Transactions on Computers.
[26] Haichen Shen,et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning , 2018, OSDI.
[27] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[28] Jeff Dean. Deep Learning for Solving Important Problems , 2019, WWW.
[29] Apala Guha,et al. μIR - An intermediate representation for transforming and optimizing the microarchitecture of application accelerators , 2019, MICRO.
[30] Pasi Liljeberg,et al. Energy-Efficient Virtual Machines Consolidation in Cloud Data Centers Using Reinforcement Learning , 2014, 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[31] Uday Bondhugula,et al. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation , 2021, 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[32] Pier Luca Lanzi,et al. Ant Colony Heuristic for Mapping and Scheduling Tasks and Communications on Heterogeneous Embedded Systems , 2010, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[33] Massoud Pedram,et al. A Deep Reinforcement Learning Framework for Architectural Exploration: A Routerless NoC Case Study , 2020, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[34] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[35] Wolfgang Maass,et al. Networks of Spiking Neurons: The Third Generation of Neural Network Models , 1996, Electron. Colloquium Comput. Complex..
[36] David A. Patterson,et al. A New Golden Age in Computer Architecture: Empowering the Machine-Learning Revolution , 2018, IEEE Micro.
[37] Yiyu Shi,et al. Hardware/Software Co-Exploration of Neural Architectures , 2019, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[38] Sarita V. Adve,et al. HPVM: heterogeneous parallel virtual machine , 2018, PPoPP.