Yanxiang He | Fang Liu | Zimeng Fan | Wei Hu | Dian Xu
[1] Matthieu Cord et al. Training data-efficient image transformers & distillation through attention, 2020, ICML.
[2] Jason Cong et al. FPGA-based accelerator for long short-term memory recurrent neural networks, 2017, 22nd Asia and South Pacific Design Automation Conference (ASP-DAC).
[3] Lei He et al. OPU: An FPGA-Based Overlay Processor for Convolutional Neural Networks, 2020, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[4] Hongmei Li et al. FPGA Based Real-Time Processing Architecture for Recurrent Neural Network, 2017.
[5] Qun Liu et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2020, EMNLP.
[6] Stephen Lin et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, 2021, IEEE/CVF International Conference on Computer Vision (ICCV).
[7] Lei He et al. Light-OPU: An FPGA-based Overlay Processor for Lightweight Convolutional Neural Networks, 2020, FPGA.
[8] Lei He et al. Uni-OPU: An FPGA-Based Uniform Accelerator for Convolutional and Transposed Convolutional Networks, 2020, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[9] Mohamed S. Abdelfattah et al. DLA: Compiler and FPGA Overlay for Neural Network Inference Acceleration, 2018, 28th International Conference on Field Programmable Logic and Applications (FPL).
[10] Siyuan Lu et al. Hardware Accelerator for Multi-Head Attention and Position-Wise Feed-Forward in the Transformer, 2020, IEEE 33rd International System-on-Chip Conference (SOCC).
[11] Kevin Gimpel et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[12] Kevin Gimpel et al. Gaussian Error Linear Units (GELUs), 2016.
[13] Jianfei Cai et al. Scalable Vision Transformers with Hierarchical Pooling, 2021, IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Luciano Lavagno et al. Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs, 2018, FPGA.
[15] Jungwook Choi et al. OPTIMUS: OPTImized matrix MUltiplication Structure for Transformer neural network accelerator, 2020, MLSys.
[16] Zhijian Liu et al. Lite Transformer with Long-Short Range Attention, 2020, ICLR.
[17] Jian Cheng et al. Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing, 2021, Design, Automation & Test in Europe Conference & Exhibition (DATE).
[18] Soheil Ghiasi et al. Ristretto: A Framework for Empirical Study of Resource-Efficient Inference in Convolutional Neural Networks, 2018, IEEE Transactions on Neural Networks and Learning Systems.
[19] Levent Sagun et al. ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases, 2021, ICML.
[20] Ji Li et al. FTRANS: energy-efficient acceleration of transformers using FPGA, 2020, ISLPED.
[21] Lukasz Kaiser et al. Attention is All you Need, 2017, NIPS.
[22] Yiming Yang et al. MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices, 2020, ACL.
[23] Mehdi Kamal et al. POLAR: A Pipelined/Overlapped FPGA-Based LSTM Accelerator, 2020, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[24] Georg Heigold et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2021, ICLR.
[25] Omer Levy et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, arXiv.
[26] Ming-Wei Chang et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[27] Lei He et al. NPE: An FPGA-based Overlay Processor for Natural Language Processing, 2021, FPGA.
[28] Jason Cong et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks, 2015, FPGA.