The Vision Behind MLPerf: Understanding AI Inference Performance

Deep learning has sparked a renaissance in computer systems and architecture. Despite the breakneck pace of innovation, a crucial issue confronts the research and industry communities at large: how to enable neutral and useful performance assessment of machine learning (ML) software frameworks, ML hardware accelerators, and ML systems comprising both the software stack and the hardware. The ML field needs systematic evaluation methods that represent real-world use cases and support meaningful comparisons across different software and hardware implementations. MLPerf answers this call. MLPerf is an ML benchmark standard driven by academia and industry, with more than 70 participating organizations. Drawing on the collective expertise of these organizations, MLPerf establishes a standard benchmark suite, with well-defined metrics and benchmarking methodologies, that levels the playing field for measuring the performance of different ML inference hardware, software, and services.