Optimising deep learning inference across edge devices, against targets such as inference time, memory footprint, and power consumption, is a key challenge given the ubiquity of neural networks. Today's production deep learning frameworks provide useful abstractions for machine learning engineers and systems researchers, but in exchange they can suffer from compatibility problems (especially on constrained platforms), inaccessible code complexity, or design choices that otherwise limit research from a systems perspective. This paper presents Orpheus, a new deep learning framework for easy prototyping, deployment, and evaluation of inference optimisations. Orpheus features a small codebase, minimal dependencies, and a simple process for integrating other third-party systems. We present preliminary evaluation results.