Deep Learning in Visual Computing and Signal Processing

Deep learning is a subfield of machine learning that aims to learn a hierarchy of features from input data. Researchers have intensively investigated deep learning algorithms for challenging problems in areas such as image classification, speech recognition, signal processing, and natural language processing. In this study, we review typical deep learning algorithms in computer vision and signal processing and describe in detail how to apply deep learning to specific tasks such as road crack detection, fault diagnosis, and human activity detection. We also discuss the challenges of designing and training deep neural networks.
