ViennaCL++: Enable TensorFlow/Eigen via ViennaCL with OpenCL C++ Flow

This paper presents ViennaCL++, an OpenCL C++ kernel library for the Vienna Computing Library (ViennaCL), combined with the TensorFlow/Eigen library to accelerate and optimize linear algebra computations. Previously, TensorFlow invoked Eigen for its solvers. To enable an OpenCL flow, Eigen can be invoked via ViennaCL to generate kernel programs for GPU computation. To support the features of the latest specification, ViennaCL++ migrates the linear algebra kernel library to OpenCL C++, using C++ features to construct the OpenCL flow for TensorFlow and its underlying computational library, Eigen. The software flow is based on the state-of-the-art OpenCL specification and the OpenCL C++ kernel language, as well as the SPIR-V binary intermediate representation. In experiments, ViennaCL++ with C++ classes and the SPIR-V flow achieves speedups of 8 times and 49 times for BLAS2 and BLAS3 operations, respectively, over the Eigen library on Intel x86_64 hardware. Overall, these results indicate that the runtime performance of ViennaCL++ with the OpenCL C++ and SPIR-V flow is similar to that of the traditional OpenCL C flow. Note that the Intel OpenCL 2.1 compiler supports most of the Khronos OpenCL 2.2 (OpenCL C++) language features, which made the experiments possible.
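For readers unfamiliar with the BLAS levels named in the results, the following is a minimal host-side sketch in standard C++ of what a BLAS2 operation (matrix-vector product) and a BLAS3 operation (matrix-matrix product) compute. It is not ViennaCL++ code and does not use the OpenCL, Eigen, or ViennaCL APIs; the `Matrix` type and the `gemv` and `gemm` functions are hypothetical names introduced here only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix, used only in this sketch.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// BLAS2-style operation: y = A * x (O(n^2) work on O(n^2) data).
std::vector<double> gemv(const Matrix& A, const std::vector<double>& x) {
    assert(A.cols == x.size());
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i)
        for (std::size_t j = 0; j < A.cols; ++j)
            y[i] += A(i, j) * x[j];
    return y;
}

// BLAS3-style operation: C = A * B (O(n^3) work on O(n^2) data).
Matrix gemm(const Matrix& A, const Matrix& B) {
    assert(A.cols == B.rows);
    Matrix C(A.rows, B.cols);
    for (std::size_t i = 0; i < A.rows; ++i)
        for (std::size_t k = 0; k < A.cols; ++k)
            for (std::size_t j = 0; j < B.cols; ++j)
                C(i, j) += A(i, k) * B(k, j);
    return C;
}
```

BLAS3 operations perform more arithmetic per element of data moved than BLAS2 operations, which is one commonly cited reason they tend to benefit more from accelerator offloading; whether that explains the gap reported here is not stated in the abstract.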
