暂无分享,去创建一个
[1] Yiming Yang,et al. MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices , 2020, ACL.
[2] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[3] Erich Elsen,et al. Fast Sparse ConvNets , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Avrim Blum,et al. The Bottleneck , 2021, Monopsony Capitalism.
[5] Edouard Grave,et al. Training with Quantization Noise for Extreme Model Compression , 2020, ICLR.
[6] Li Fei-Fei,et al. Progressive Neural Architecture Search , 2017, ECCV.
[7] Quoc V. Le,et al. Searching for MobileNetV3 , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[8] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[10] Oleksandr Makeyev,et al. Neural network with ensembles , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[11] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.
[12] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[13] Alok Aggarwal,et al. Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.
[14] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[15] Jonas Mockus,et al. On Bayesian Methods for Seeking the Extremum , 1974, Optimization Techniques.
[16] Pete Warden,et al. TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers , 2019 .
[17] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[18] Nikos Komodakis,et al. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer , 2016, ICLR.
[19] Frank Hutter,et al. Neural Architecture Search: A Survey , 2018, J. Mach. Learn. Res..
[20] Xin Dong,et al. Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon , 2017, NIPS.
[21] Prabhu Kaliamoorthi,et al. Distilling Large Language Models into Tiny and Effective Students using pQRNN , 2021, ArXiv.
[22] Yoshua Bengio,et al. Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..
[23] Pascal Frossard,et al. Adaptive data augmentation for image classification , 2016, 2016 IEEE International Conference on Image Processing (ICIP).
[24] Nikos Komodakis,et al. Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.
[25] Matthew Richardson,et al. Do Deep Convolutional Nets Really Need to be Deep and Convolutional? , 2016, ICLR.
[26] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.
[27] Quoc V. Le,et al. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension , 2018, ICLR.
[28] Ion Stoica,et al. Tune: A Research Platform for Distributed Model Selection and Training , 2018, ArXiv.
[29] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[30] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[31] Kan Chen,et al. Billion-scale semi-supervised learning for image classification , 2019, ArXiv.
[32] Zornitsa Kozareva,et al. Transferable Neural Projection Representations , 2019, NAACL.
[33] Wonyong Sung,et al. Structured Pruning of Deep Convolutional Neural Networks , 2015, ACM J. Emerg. Technol. Comput. Syst..
[34] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[35] Zornitsa Kozareva,et al. ProFormer: Towards On-Device LSH Projection Based Transformers , 2021, EACL.
[36] Alexei A. Efros,et al. Unsupervised Visual Representation Learning by Context Prediction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[37] Yi Tay,et al. Efficient Transformers: A Survey , 2020, ArXiv.
[38] Qiang Chen,et al. Towards Accurate Post-training Network Quantization via Bit-Split and Stitching , 2020, ICML.
[39] Song Han,et al. AMC: AutoML for Model Compression and Acceleration on Mobile Devices , 2018, ECCV.
[40] Quoc V. Le,et al. Randaugment: Practical automated data augmentation with a reduced search space , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[41] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[42] Geoffrey E. Hinton,et al. Big Self-Supervised Models are Strong Semi-Supervised Learners , 2020, NeurIPS.
[43] Chen Sun,et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[44] Bo Chen,et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[45] Luke Zettlemoyer,et al. Sparse Networks from Scratch: Faster Training without Losing Performance , 2019, ArXiv.
[46] Hong Zhu,et al. Hyper-Parameter Optimization: A Review of Algorithms and Applications , 2020, ArXiv.
[47] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[48] Raghuraman Krishnamoorthi,et al. Quantizing deep convolutional networks for efficient inference: A whitepaper , 2018, ArXiv.
[49] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[50] Sujith Ravi,et al. ProjectionNet: Learning Efficient On-Device Deep Networks Using Neural Projections , 2017, ArXiv.
[51] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[52] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[53] Bin Liu,et al. Ternary Weight Networks , 2016, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[54] Wei Wei,et al. 2019 Formatting Instructions for Authors Using LaTeX , 2018 .
[55] Rajat Raina,et al. Large-scale deep unsupervised learning using graphics processors , 2009, ICML '09.
[56] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Bertrand A. Maher,et al. Glow: Graph Lowering Compiler Techniques for Neural Networks , 2018, ArXiv.
[58] Yoshua Bengio,et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.
[59] Ameet Talwalkar,et al. Non-stochastic Best Arm Identification and Hyperparameter Optimization , 2015, AISTATS.
[60] Quoc V. Le,et al. Efficient Neural Architecture Search via Parameter Sharing , 2018, ICML.
[61] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Suyog Gupta,et al. To prune, or not to prune: exploring the efficacy of pruning for model compression , 2017, ICLR.
[63] Anders Krogh,et al. Neural Network Ensembles, Cross Validation, and Active Learning , 1994, NIPS.
[64] Rico Sennrich,et al. Edinburgh Neural Machine Translation Systems for WMT 16 , 2016, WMT.
[65] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, ArXiv.
[66] Bo Chen,et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[67] Lynn Conway,et al. Introduction to VLSI systems , 1978 .
[68] Xiangyu Zhang,et al. MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[69] Vincent Vanhoucke,et al. Improving the speed of neural networks on CPUs , 2011 .
[70] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[71] Yuandong Tian,et al. FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[72] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[73] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[74] Moses Charikar,et al. Similarity estimation techniques from rounding algorithms , 2002, STOC '02.
[75] Vijay Vasudevan,et al. Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[76] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[77] Chun-Ta Lu,et al. Neural Structured Learning: Training Neural Networks with Structured Signals , 2020, KDD.
[78] Gregory J. Wolff,et al. Optimal Brain Surgeon and general network pruning , 1993, IEEE International Conference on Neural Networks.
[79] Zornitsa Kozareva,et al. Self-Governing Neural Networks for On-Device Short Text Classification , 2018, EMNLP.
[80] Graham Neubig,et al. SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation , 2018, EMNLP.
[81] Mark Sandler,et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[82] Hanan Samet,et al. Pruning Filters for Efficient ConvNets , 2016, ICLR.
[83] Quoc V. Le,et al. AutoAugment: Learning Augmentation Strategies From Data , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[84] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[85] Alexei Baevski,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[86] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[87] Albert Cohen,et al. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions , 2018, ArXiv.
[88] Hongyi Zhang,et al. mixup: Beyond Empirical Risk Minimization , 2017, ICLR.
[89] Geoffrey Zweig,et al. Multi-modal Self-Supervision from Generalized Data Transformations , 2020, ArXiv.
[90] Hiroshi Inoue,et al. Data Augmentation by Pairing Samples for Images Classification , 2018, ArXiv.
[91] Ameet Talwalkar,et al. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization , 2016, J. Mach. Learn. Res..
[92] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.
[93] Emily Denton,et al. Characterising Bias in Compressed Models , 2020, ArXiv.
[94] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[95] Michael Carbin,et al. The Lottery Ticket Hypothesis: Training Pruned Neural Networks , 2018, ArXiv.
[96] Quoc V. Le,et al. Self-Training With Noisy Student Improves ImageNet Classification , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[97] Rohan Ramanath,et al. An Attentive Survey of Attention Models , 2019, ACM Trans. Intell. Syst. Technol..
[98] D. Sculley,et al. Google Vizier: A Service for Black-Box Optimization , 2017, KDD.
[99] Rich Caruana,et al. Model compression , 2006, KDD '06.
[100] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[101] Sujith Ravi,et al. Learning from a Teacher using Unlabeled Data , 2019, ArXiv.
[102] Erich Elsen,et al. Rigging the Lottery: Making All Tickets Winners , 2020, ICML.
[103] Tomas Mikolov,et al. Advances in Pre-Training Distributed Word Representations , 2017, LREC.
[104] Mingjie Sun,et al. Rethinking the Value of Network Pruning , 2018, ICLR.
[105] Matthias Seeger,et al. Amazon SageMaker Automatic Model Tuning: Scalable Black-box Optimization , 2020, ArXiv.
[106] Nipun Batra,et al. Exploring Bayesian Optimization , 2020 .
[107] James R. Glass. Towards unsupervised speech processing , 2012, 2012 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA).
[108] H. T. Kung. Why systolic architectures? , 1982, Computer.
[109] Erich Elsen,et al. The State of Sparsity in Deep Neural Networks , 2019, ArXiv.
[110] Luca Maria Gambardella,et al. High-Performance Neural Networks for Visual Object Classification , 2011, ArXiv.
[111] Timo Aila,et al. Pruning Convolutional Neural Networks for Resource Efficient Transfer Learning , 2016, ArXiv.
[112] Thomas Wolf,et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter , 2019, ArXiv.
[113] Dan Alistarh,et al. Model compression via distillation and quantization , 2018, ICLR.
[114] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[115] Zornitsa Kozareva,et al. PRADO: Projection Attention Networks for Document Classification On-Device , 2019, EMNLP.
[116] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[117] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.