High-Throughput Multi-Threaded Sum-Product Network Inference in the Reconfigurable Cloud

Large cloud providers have started to make powerful FPGAs available as part of their public cloud offerings. One promising application area for such instances is the acceleration of machine learning tasks. This work presents an accelerator architecture that uses multiple accelerator cores for inference in so-called Sum-Product Networks and complements it with a host software interface that overlaps data transfer and actual computation. The evaluation shows that the proposed architecture, deployed to Amazon AWS F1 instances, is able to outperform a 12-core Xeon processor by a factor of up to 1.9x and an Nvidia Tesla V100 GPU by a factor of up to 6.6x.
