Real-time dictionary based super-resolution of surveillance video streams and targets

Real-time super-resolution within surveillance video streams is a powerful tool for security and crime prevention, allowing, for example, events, faces, or objects such as number plates and luggage to be identified more accurately on the fly and from a distance. However, many state-of-the-art approaches to super-resolution are too computationally expensive for real-time applications in a surveillance context. We consider one particular contemporary method based on sparse coding,¹ and show how, by relaxing some model constraints, it can be sped up significantly compared to the reference implementation, and thus approach real-time performance with a visually imperceptible reduction in fidelity. The final computation is three orders of magnitude faster than the reference implementation. The quality of the output is maintained: the PSNR of the super-resolved images relative to ground truth is not significantly different from that of the reference implementation, and remains noticeably better than that of the baseline bicubic interpolation approach.
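The PSNR comparison described above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `mse` and `psnr`, the representation of an image as rows of pixel values, and the 8-bit peak value of 255 are all assumptions made for illustration.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized greyscale images,
    each given as a list of rows of pixel values (assumed layout)."""
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("rows must have the same length")
        for r, e in zip(ref_row, est_row):
            total += (r - e) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE).
    The default peak of 255 assumes 8-bit pixel values."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)


# Toy data (hypothetical): a ground-truth patch and two reconstructions.
ground_truth = [[0, 255], [255, 0]]
coarse = [[20, 235], [235, 20]]   # stands in for a weaker baseline
refined = [[0, 255], [255, 10]]   # stands in for a better reconstruction

print(psnr(ground_truth, coarse))
print(psnr(ground_truth, refined))
```

A higher PSNR means the reconstruction is closer to the ground truth, so a method that improves on a baseline should score higher against the same ground-truth frames.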

[1] William T. Freeman, et al. Learning low-level vision, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[2] Yair Weiss, et al. From learning models of natural image patches to whole image restoration, 2011, 2011 International Conference on Computer Vision.

[3] Deqing Sun, et al. A Bayesian approach to adaptive video super resolution, 2011, CVPR 2011.

[4] Shaogang Gong, et al. Generalized Face Super-Resolution, 2008, IEEE Transactions on Image Processing.

[5] David A. McAllester, et al. Object Detection with Discriminatively Trained Part Based Models, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Michal Irani, et al. Improving resolution by image registration, 1991, CVGIP Graph. Model. Image Process.

[7] Michael Elad, et al. On Single Image Scale-Up Using Sparse-Representations, 2010, Curves and Surfaces.

[8] J. D. van Ouwerkerk, et al. Image super-resolution survey, 2006, Image Vis. Comput.

[9] Thomas S. Huang, et al. Image super-resolution as sparse representation of raw image patches, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10] Frédo Durand, et al. Efficient marginal likelihood optimization in blind deconvolution, 2011, CVPR 2011.

[11] Shaogang Gong, et al. User-assisted visual search and tracking across distributed multi-camera networks, 2011, Security and Defence.

[12] Thomas S. Huang, et al. Face hallucination via sparse coding, 2008, 2008 15th IEEE International Conference on Image Processing.

[13] Michael Elad, et al. Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries, 2006, IEEE Transactions on Image Processing.

[14] i-LIDS Team, et al. Imagery Library for Intelligent Detection Systems (i-LIDS); A Standard for Testing Video Based Detection Systems, 2006, Proceedings 40th Annual 2006 International Carnahan Conference on Security Technology.

[15] Thomas S. Huang, et al. Image Super-Resolution Via Sparse Representation, 2010, IEEE Transactions on Image Processing.

[16] Paul A. Viola, et al. Robust Real-Time Face Detection, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[17] Wan-Chi Siu, et al. Single image super-resolution using Gaussian process regression, 2011, CVPR 2011.

[18] Lina J. Karam, et al. A No-Reference Objective Image Sharpness Metric Based on the Notion of Just Noticeable Blur (JNB), 2009, IEEE Transactions on Image Processing.