Parallel implementation and optimization of high definition video real-time dehazing

In warning applications such as aircraft take-off and landing, ship navigation, and traffic guidance in foggy weather, rapid dehazing of high definition (HD) images and videos is increasingly necessary. Existing image and video dehazing techniques do not fully exploit the parallel computing capacity of modern multi-core CPUs and GPUs, which leads to long dehazing times or low video frame rates that cannot meet real-time requirements. In this paper, we propose a parallel implementation and optimization method for real-time dehazing of high definition videos, based on a single image haze removal algorithm. Our optimization takes full advantage of the modern CPU+GPU architecture, which increases the parallelism of the algorithm and greatly reduces the computational complexity and the execution time. The optimized OpenCL parallel implementation is integrated into FFmpeg as an independent module. The experimental results show that, for a single image, the performance of the optimized OpenCL algorithm improves by approximately 500% over the existing algorithm and by approximately 153% over the basic OpenCL algorithm. A 1080p (1920 × 1080) high definition hazy video can also be processed at a real-time rate (more than 41 frames per second).
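To put the reported frame rate in concrete terms, the following minimal sketch works out the per-frame time budget and pixel throughput implied by 41 frames per second at 1920 × 1080. It uses only the figures stated in the abstract; the function names are hypothetical and are not taken from the paper's implementation.

```python
# Illustrative arithmetic only: the resolution and frame rate come from the
# abstract; the helper names below are hypothetical.

WIDTH, HEIGHT = 1920, 1080   # 1080p frame size in pixels
FPS = 41                     # reported lower bound on the dehazing frame rate


def frame_budget_ms(fps: float) -> float:
    """Maximum time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def pixel_throughput(width: int, height: int, fps: int) -> int:
    """Pixels that must be processed per second to sustain the frame rate."""
    return width * height * fps


if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(FPS):.2f} ms")
    print(f"Pixels per frame: {WIDTH * HEIGHT}")
    print(f"Pixels per second: {pixel_throughput(WIDTH, HEIGHT, FPS)}")
```

Under these figures, every stage of the dehazing pipeline, including any host-to-device transfers, has to complete in roughly 24 ms per frame on average.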
