This paper presents ONNC (Open Neural Network Compiler), a retargetable compilation framework designed to connect ONNX (Open Neural Network Exchange) models to proprietary deep learning accelerators (DLAs). The intermediate representations (IRs) of ONNC have a one-to-one mapping to ONNX IRs, making it much simpler to port ONNC to proprietary DLAs than other compilation frameworks such as TVM and Glow, especially for hardware with coarse-grained operators that are not part of the generic IRs in the LLVM backend. ONNC also has a flexible pass manager designed to support compiler optimizations at all levels. A Docker image of ONNC bundled with a Vanilla backend is released with this paper to enable fast porting to new hardware targets. To illustrate how an ONNC-based toolkit guides our research and development in DLA design, we present a case study on compiler optimizations for activation memory consumption. The study shows that the Best-Fit algorithm, combined with a proposed heuristic and a reordering scheme, can act as a near-optimal strategy, bringing memory consumption close to the ideal lower bound in 11 of 12 models from the ONNX model zoo. To the best of our knowledge, ONNC is the first open-source compilation framework specially designed to support ONNX-based models for both commercial and research deep learning projects.
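To make the memory-allocation setting concrete, the following is a minimal sketch of Best-Fit offset assignment for activation tensors with known live ranges. It is illustrative only: the `Tensor` type, the field names, and the processing order are assumptions for this example, and the paper's proposed heuristic and reordering scheme are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    size: int   # bytes needed for the activation
    start: int  # first instruction index where the tensor is live
    end: int    # last instruction index where the tensor is live (inclusive)

def overlaps(a: Tensor, b: Tensor) -> bool:
    # Two tensors conflict only if their live ranges intersect.
    return a.start <= b.end and b.start <= a.end

def best_fit_offsets(tensors):
    """Assign each tensor a byte offset so that tensors whose live
    ranges overlap never share memory.  Best-Fit: among all feasible
    gaps, choose the one with the least leftover slack."""
    placed = []   # list of (tensor, offset) already assigned
    offsets = {}
    for t in tensors:
        live = [(u, off) for (u, off) in placed if overlaps(t, u)]
        # Candidate offsets: address 0 and the end of each conflicting block.
        candidates = {0} | {off + u.size for (u, off) in live}
        best = None  # (slack, offset)
        for c in sorted(candidates):
            # Skip candidates that would overlap a live block [off, off+size).
            if any(off < c + t.size and c < off + u.size for (u, off) in live):
                continue
            # Slack = free space between our top and the next block above.
            above = [off for (u, off) in live if off >= c + t.size]
            slack = (min(above) - (c + t.size)) if above else float("inf")
            if best is None or slack < best[0]:
                best = (slack, c)
        offsets[t.name] = best[1]
        placed.append((t, best[1]))
    return offsets
```

Because C's live range does not overlap A's, Best-Fit can reuse A's slot: with `A(100 B, live 0-2)`, `B(50 B, live 1-3)`, `C(100 B, live 3-5)`, the peak footprint is 150 bytes rather than 250. An ideal lower bound for comparison, as in the paper's study, is the maximum total size of simultaneously live tensors.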
[1] Andrew W. Appel. Modern Compiler Implementation in ML: Basic Techniques, 1997.
[2] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), 2017.
[3] Gregory J. Chaitin, et al. Register allocation & spilling via graph coloring, SIGPLAN '82, 1982.
[4] Carter Bays, et al. A comparison of next-fit, first-fit, and best-fit, CACM, 1977.
[5] Bertrand A. Maher, et al. Glow: Graph Lowering Compiler Techniques for Neural Networks, arXiv, 2018.
[6] Haichen Shen, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, 2018.