Towards a methodology for creating time-critical, cloud-based CUDA applications

CUDA has been used in many application domains, not all of which relate specifically to image processing. Cloud facilities such as Amazon Web Services (AWS) offer the opportunity to use multiple and/or distributed CUDA resources, both to obtain greater processing power and to satisfy time-critical requirements that a single CUDA resource cannot meet. In particular, this would enhance the ability to process Big Data, for example in conjunction with distributed file systems. In this paper, we present a survey of time-critical CUDA applications, identifying the requirements and concepts that they tend to have in common. We investigate the terminology used for Quality of Service metrics, and present a taxonomy that summarises the underlying concepts and maps them to the diverse terminology in use. We also survey typical requirements for developing, deploying and managing such applications. Given these requirements, we consider how the SWITCH platform can in principle support the entire life-cycle of time-critical CUDA application development and cloud deployment, and identify the specific extensions that would be needed to fully support this class of time-critical cloud applications.
