Artificial Intelligence (AI) is now the center of attention for many industries, ranging from private companies to academic institutions. While domains of interest and AI applications vary, one concern is common to all: how can we determine whether an end-to-end AI solution is performant? As AI spreads to more industries, which metrics should serve as the reference for AI applications and benchmarks in the enterprise space? This paper intends to answer some of these questions. At present, AI benchmarks focus either on evaluating deep learning approaches or on infrastructure capabilities. Unfortunately, these approaches do not capture the end-to-end performance behavior of enterprise AI workloads. It is also clear that no single reference metric will suit all AI applications or all existing platforms. We first present the state of the art of the basic and most popular current AI benchmarks. We then present the main characteristics of AI workloads from various industrial domains. Finally, we focus on the needs of ongoing and future industry AI benchmarks and conclude with the gaps that must be closed to improve AI benchmarks for enterprise workloads.