Parallel compressive sampling matching pursuit algorithm for compressed sensing signal reconstruction with OpenCL

This work applies emerging computing technology to a compute-intensive problem in the geosciences. Because it is written in OpenCL, the parallel algorithm can run on different heterogeneous computing platforms, including both NVIDIA and AMD GPU-based platforms; an implementation developed in CUDA would not have this cross-platform capability. To our knowledge, almost no prior research in the geosciences has used heterogeneous computing technology to solve this kind of problem, so this work should be helpful to researchers in the field.

Compressive sensing (CS) is a signal processing method developed in recent years. CS can sample signals at a rate far below the Nyquist rate, and it compresses the signal while sampling, which reduces the resources needed for signal transmission and storage. However, the reconstruction algorithm used in the corresponding decoder is highly complex and computationally expensive. Thus, in some applications, e.g., remote sensing image processing for disaster monitoring, CS reconstruction usually cannot meet the time requirements on traditional computing platforms. Various studies have shown that many-core computing platforms, such as GPUs programmed with OpenCL, are among the most promising platforms available for real-time processing because of their powerful floating-point computing capabilities. In this study, we present the design and implementation of parallel compressive sampling matching pursuit (CoSaMP), an OpenCL-based parallel CS reconstruction algorithm, together with several optimization strategies, including access efficiency, numerical merge, and instruction optimization. In experiments on remote sensing images of different sizes, the proposed parallel algorithm achieved speedups of about 41 times and 58 times on AMD HD7350 and NVIDIA K20Xm platforms, respectively, without modifying the application code.
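For orientation, the serial CoSaMP iteration that this work parallelizes can be sketched as follows. This is a minimal pure-Python illustration of the standard algorithm of Needell and Tropp, not the OpenCL implementation described above. The function names, the normal-equations least-squares solve, the iteration cap, and the stopping tolerance are all assumptions made for the sketch. The matrix-vector products and the least-squares step are the parts that dominate the cost and are the natural candidates for offloading to a many-core device.

```python
import random


def matvec(A, x):
    """Compute A @ x for a dense matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T @ r (the signal proxy in CoSaMP)."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def top_k(v, k):
    """Indices of the k largest-magnitude entries of v."""
    return sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)[:k]


def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    z = [0.0] * n
    for c in reversed(range(n)):
        s = aug[c][n] - sum(aug[c][j] * z[j] for j in range(c + 1, n))
        z[c] = s / aug[c][c]
    return z


def least_squares(A, y, cols):
    """Least-squares fit of y using only the columns in cols (normal equations)."""
    sub = [[row[j] for j in cols] for row in A]
    q = len(cols)
    gram = [[sum(r[a] * r[b] for r in sub) for b in range(q)] for a in range(q)]
    rhs = [sum(r[a] * yi for r, yi in zip(sub, y)) for a in range(q)]
    return solve(gram, rhs)


def cosamp(A, y, k, iters=30, tol=1e-9):
    """Recover a k-sparse x from y = A x (assumed sketch, serial reference)."""
    n = len(A[0])
    x = [0.0] * n
    r = y[:]
    for _ in range(iters):
        proxy = rmatvec(A, r)                    # 1. form the signal proxy
        omega = top_k(proxy, 2 * k)              # 2. identify 2k candidates
        support = sorted(set(omega) | {j for j in range(n) if x[j] != 0.0})
        coef = least_squares(A, y, support)      # 3. solve on merged support
        x = [0.0] * n
        for idx in top_k(coef, k):               # 4. prune to the k largest
            x[support[idx]] = coef[idx]
        r = [yi - pi for yi, pi in zip(y, matvec(A, x))]  # 5. update residual
        if sum(v * v for v in r) ** 0.5 < tol:
            break
    return x


if __name__ == "__main__":
    random.seed(0)
    m, n, k = 48, 96, 3                          # hypothetical problem sizes
    A = [[random.gauss(0, 1) / m ** 0.5 for _ in range(n)] for _ in range(m)]
    x_true = [0.0] * n
    x_true[5], x_true[40], x_true[77] = 2.0, -3.0, 2.5
    x_hat = cosamp(A, matvec(A, x_true), k)
    print(max(abs(a - b) for a, b in zip(x_hat, x_true)))
```

The demo measures a 96-sample signal with only 48 random projections, which mirrors the sub-Nyquist sampling described above. The normal-equations solve is chosen only to keep the sketch dependency-free; a production solver would more likely use a QR or conjugate-gradient method.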
