Deep Learning Hyper-Parameter Optimization for Video Analytics in Clouds

We propose a video analytics system built on a dynamically tuned convolutional network. Videos are fetched from cloud storage and preprocessed, and a classification model is trained on these video streams using cloud-based infrastructure. A key focus of this paper is tuning the hyper-parameters of the deep learning algorithm used to construct the model. We further propose an automatic video object classification pipeline to validate the system. The mathematical model used to support hyper-parameter tuning improves the performance of the proposed pipeline, and the effects of various parameters on the system's performance are compared. The parameters that yield the best performance are then selected for the video object classification pipeline. Our experiment-based validation shows an accuracy of 97% and a precision of 96%. The system proved to be scalable, robust, and customizable for a variety of applications.
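
The compare-then-select step described above can be illustrated with a minimal sketch. The abstract does not specify the tuning procedure or the mathematical model, so this example assumes a plain exhaustive grid search as a stand-in. The search space, the parameter names, and the `validation_score` function are hypothetical; a real pipeline would train the network with each configuration and measure accuracy on held-out video frames.

```python
import itertools

# Hypothetical search space; names and values are illustrative only and are
# not taken from the paper.
SEARCH_SPACE = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 64, 128],
    "dropout": [0.25, 0.5],
}


def validation_score(params):
    """Stand-in for training a network and measuring validation accuracy.

    This toy function peaks at learning_rate=0.01, batch_size=64,
    dropout=0.5 so that the search has a well-defined optimum.
    """
    penalty = (
        abs(params["learning_rate"] - 0.01) * 10
        + abs(params["batch_size"] - 64) / 640
        + abs(params["dropout"] - 0.5)
    )
    return 1.0 - penalty


def grid_search(space, score_fn):
    """Score every combination; return (best_params, best_score, trials)."""
    names = list(space)
    trials = []
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        trials.append((params, score_fn(params)))
    # Select the configuration with the highest score.
    best_params, best_score = max(trials, key=lambda trial: trial[1])
    return best_params, best_score, trials


if __name__ == "__main__":
    best, score, trials = grid_search(SEARCH_SPACE, validation_score)
    print(f"evaluated {len(trials)} configurations")
    print(f"best: {best} (score {score:.3f})")
```

Grid search is used here only because it is the simplest way to show the comparison; model-based or random search would replace the `itertools.product` loop while keeping the same selection step.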
