Big biomedical image processing hardware acceleration: A case study for K-means and image filtering

Most hospitals today face a big data problem: they generate and store petabytes of patient records in their datacenters, most of them in the form of medical images such as pathology images, CT scans, and X-rays. Analyzing such large volumes of biomedical imaging data to enable discovery and guide physicians in personalized care is becoming an important focus of the data mining and machine learning algorithms developed for biomedical informatics (BMI). These algorithms rely heavily on complex, computationally intensive machine learning and data mining methods to learn from large data sets. The high processing demand of big biomedical imaging data has led to their implementation on high-end server platforms running software ecosystems optimized for large amounts of data, including Apache Hadoop and Apache Spark. However, processing such large amounts of imaging data efficiently with computationally intensive learning methods is becoming a challenge even for state-of-the-art high-performance computing server architectures. To address this challenge, in this paper we introduce a scalable and efficient hardware acceleration method that uses low-cost commodity FPGAs interfaced with a server architecture through a high-speed interface. We present a full end-to-end implementation of big data image processing and machine learning applications on a heterogeneous CPU+FPGA architecture. We develop MapReduce implementations of K-means and Laplacian filtering in the Hadoop Streaming environment, which allows mapper functions to be written in non-Java languages suited to interfacing with an FPGA-based hardware acceleration environment. We accelerate the mapper functions through hardware+software (HW+SW) co-design and fully implement the HW+SW mappers on the Zynq FPGA platform. The results show a promising kernel speedup of up to 27× for large image data sets. This translates to 7.8× and 1.8× speedups in an end-to-end Hadoop MapReduce implementation of the K-means and Laplacian filtering algorithms, respectively.
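
To make the mapper and reducer roles concrete, the following is a minimal software-only sketch of how K-means and a Laplacian filter might be written for Hadoop Streaming, which exchanges tab-separated key/value lines over standard input and output. It is not the authors' implementation: the function names, the comma-separated pixel format, the way centroids are passed in, and the 4-neighbour Laplacian kernel are all assumptions made for illustration, and the FPGA offload is not modeled.

```python
# Illustrative sketch only: names, input format, and kernel choice are
# assumptions, not taken from the paper.

def nearest_centroid(pixel, centroids):
    """Return the index of the centroid closest to pixel (squared Euclidean)."""
    best_idx, best_dist = 0, float("inf")
    for idx, centroid in enumerate(centroids):
        dist = sum((p - c) ** 2 for p, c in zip(pixel, centroid))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def kmeans_map(lines, centroids):
    """Mapper: emit 'cluster_id<TAB>pixel' for each comma-separated pixel line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank input lines
        pixel = [float(v) for v in line.split(",")]
        yield f"{nearest_centroid(pixel, centroids)}\t{line}"


def kmeans_reduce(lines):
    """Reducer: average the pixels of each cluster to get updated centroids."""
    sums, counts = {}, {}
    for line in lines:
        key, value = line.rstrip("\n").split("\t")
        pixel = [float(v) for v in value.split(",")]
        acc = sums.setdefault(key, [0.0] * len(pixel))
        for i, v in enumerate(pixel):
            acc[i] += v
        counts[key] = counts.get(key, 0) + 1
    return {k: [s / counts[k] for s in sums[k]] for k in sums}


def laplacian_4n(img):
    """Apply a 4-neighbour Laplacian to a 2D list; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y - 1][x] + img[y + 1][x]
                         + img[y][x - 1] + img[y][x + 1]
                         - 4 * img[y][x])
    return out
```

In a streaming job, a mapper script would typically feed `sys.stdin` to `kmeans_map` and print each yielded line, and one MapReduce pass would correspond to one K-means iteration. The distance computation in `nearest_centroid` and the inner loop of `laplacian_4n` are the kind of compute-heavy kernels that an HW+SW co-design would be expected to target.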
