A Survey on Deep Learning

The field of machine learning is witnessing its golden era as deep learning slowly becomes the leader in this domain. Deep learning uses multiple layers to represent the abstractions of data to build computational models. Some key enabler deep learning algorithms such as generative adversarial networks, convolutional neural networks, and model transfers have completely changed our perception of information processing. However, there exists an aperture of understanding behind this tremendously fast-paced domain, because it was never previously represented from a multiscope perspective. The lack of core understanding renders these powerful methods as black-box machines that inhibit development at a fundamental level. Moreover, deep learning has repeatedly been perceived as a silver bullet to all stumbling blocks in machine learning, which is far from the truth. This article presents a comprehensive review of historical and recent state-of-the-art approaches in visual, audio, and text processing; social network analysis; and natural language processing, followed by the in-depth analysis on pivoting and groundbreaking advances in deep learning applications. It was also undertaken to review the issues faced in deep learning such as unsupervised learning, black-box models, and online learning and to illustrate how these challenges can be transformed into prolific future research avenues.

[1]  Marc'Aurelio Ranzato,et al.  Single Server Multi-GPU Training of ConvNets , 2013 .

[2]  Dong Yu,et al.  Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[3]  Tao Wang,et al.  Deep learning with COTS HPC systems , 2013, ICML.

[4]  Yelong Shen,et al.  Learning semantic representations using convolutional neural networks for web search , 2014, WWW.

[5]  Shu-Ching Chen,et al.  An efficient deep residual-inception network for multimedia classification , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).

[6]  Yiannis Kompatsiaris,et al.  Deep Learning Advances in Computer Vision with 3D Data , 2017, ACM Comput. Surv..

[7]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[8]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Apostol Natsev,et al.  YouTube-8M: A Large-Scale Video Classification Benchmark , 2016, ArXiv.

[10]  Geoffrey E. Hinton,et al.  Deep Boltzmann Machines , 2009, AISTATS.

[11]  Andrew Zisserman,et al.  Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[13]  Peng Wang,et al.  Link prediction in social networks: the state-of-the-art , 2014, Science China Information Sciences.

[14]  Christoph Goller,et al.  Learning task-dependent distributed representations by backpropagation through structure , 1996, Proceedings of International Conference on Neural Networks (ICNN'96).

[15]  Igor Mozetic,et al.  Multilingual Twitter Sentiment Classification: The Role of Human Annotators , 2016, PloS one.

[16]  Dong Yu,et al.  Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.

[17]  Tim Fingscheidt,et al.  A DNN regression approach to speech enhancement by artificial bandwidth extension , 2017, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).

[18]  DeLiang Wang,et al.  Deep Learning Based Binaural Speech Separation in Reverberant Environments , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[19]  George Kurian,et al.  Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.

[20]  Xiangang Li,et al.  Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition , 2014, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[21]  Jianxiong Xiao,et al.  DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[22]  Dong Yu,et al.  Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[23]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[24]  Adler J. Perotte,et al.  Deep Survival Analysis , 2016, MLHC.

[25]  Yann LeCun,et al.  Pedestrian Detection with Unsupervised Multi-stage Feature Learning , 2012, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Fakhri Karray,et al.  Survey on speech emotion recognition: Features, classification schemes, and databases , 2011, Pattern Recognit..

[27]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[28]  Taghi M. Khoshgoftaar,et al.  Deep learning applications and challenges in big data analytics , 2015, Journal of Big Data.

[29]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[30]  Paris Smaragdis,et al.  Deep learning for monaural speech separation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[31]  Zheng Zhang,et al.  MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.

[32]  Marc'Aurelio Ranzato,et al.  Large Scale Distributed Deep Networks , 2012, NIPS.

[33]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[34]  Geoffrey E. Hinton,et al.  Deep Learning , 2015, Nature.

[35]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Timothy Dozat,et al.  Incorporating Nesterov Momentum into Adam , 2016 .

[37]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[38]  Harald Haas,et al.  Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication , 2004, Science.

[39]  John Tran,et al.  cuDNN: Efficient Primitives for Deep Learning , 2014, ArXiv.

[40]  Fabio Viola,et al.  The Kinetics Human Action Video Dataset , 2017, ArXiv.

[41]  Tara N. Sainath,et al.  Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.

[42]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Andrew Y. Ng,et al.  Parsing Natural Scenes and Natural Language with Recursive Neural Networks , 2011, ICML.

[44]  Chao Wang,et al.  A Deep Learning Prediction Process Accelerator Based FPGA , 2015, 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.

[45]  Bowen Zhou,et al.  Applying deep learning to answer selection: A study and an open task , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).

[46]  Jiawei Han,et al.  Opinosis: A Graph Based Approach to Abstractive Summarization of Highly Redundant Opinions , 2010, COLING.

[47]  Hsin-Yu Ha,et al.  Integrating deep learning with correlation-based multimedia semantic concept detection , 2015 .

[48]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[49]  Lukasz Kaiser,et al.  One Model To Learn Them All , 2017, ArXiv.

[50]  Christopher D. Manning Understanding Human Language: Can NLP and Deep Learning Help? , 2016, SIGIR.

[51]  Jean-Claude Junqua,et al.  Robustness in Automatic Speech Recognition , 1996 .

[52]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[53]  A. Kalinovsky,et al.  Deep learning with theano, torch, caffe, tensorflow, and deeplearning4J: which one is the best in speed and accuracy? , 2016 .

[54]  Geoffrey E. Hinton,et al.  Modeling Natural Images Using Gated MRFs , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[56]  Zhiwei Zhao,et al.  Attention-Based Convolutional Neural Networks for Sentence Classification , 2016, INTERSPEECH.

[57]  Dipti Srinivasan,et al.  Neural Networks for Continuous Online Learning and Control , 2006, IEEE Transactions on Neural Networks.

[58]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[59]  Trevor Darrell,et al.  Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Quoc V. Le,et al.  A Hierarchical Model for Device Placement , 2018, ICLR.

[61]  Jesper Jensen,et al.  Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[62]  Alexander Mordvintsev,et al.  Inceptionism: Going Deeper into Neural Networks , 2015 .

[63]  Yoshua Bengio,et al.  Convolutional networks for images, speech, and time series , 1998 .

[64]  Qi Yu,et al.  DLAU: A Scalable Deep Learning Accelerator Unit on FPGA , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[65]  Yuxi Li,et al.  Deep Reinforcement Learning: An Overview , 2017, ArXiv.

[66]  Geoffrey E. Hinton,et al.  On the importance of initialization and momentum in deep learning , 2013, ICML.

[67]  Yi Li,et al.  R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[68]  Jonathan G. Fiscus,et al.  Darpa Timit Acoustic-Phonetic Continuous Speech Corpus CD-ROM {TIMIT} | NIST , 1993 .

[69]  Fei-Fei Li,et al.  Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[70]  Lei Guo,et al.  Traffic Matrix Prediction and Estimation Based on Deep Learning for Data Center Networks , 2016, 2016 IEEE Globecom Workshops (GC Wkshps).

[71]  Jason Cong,et al.  Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.

[72]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[73]  Paul Smolensky,et al.  Information processing in dynamical systems: foundations of harmony theory , 1986 .

[74]  Jen-Tzung Chien,et al.  Nonstationary Source Separation Using Sequential and Variational Bayesian Learning , 2013, IEEE Transactions on Neural Networks and Learning Systems.

[75]  Michael V. McConnell,et al.  Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning , 2017, Nature Biomedical Engineering.

[76]  Jonathan G. Fiscus,et al.  DARPA TIMIT:: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1 , 1993 .

[77]  Ngoc Thang Vu,et al.  Attentive Convolutional Neural Network Based Speech Emotion Recognition: A Study on the Impact of Input Features, Signal Length, and Acted Speech , 2017, INTERSPEECH.

[78]  Jürgen Schmidhuber,et al.  Evolving large-scale neural networks for vision-based reinforcement learning , 2013, GECCO '13.

[79]  Feng Liu,et al.  Deep Belief Network-Based Approaches for Link Prediction in Signed Social Networks , 2015, Entropy.

[80]  Min Chen,et al.  Efficient Imbalanced Multimedia Concept Retrieval by Deep Learning on Spark Clusters , 2017, Int. J. Multim. Data Eng. Manag..

[81]  O. Stegle,et al.  Deep learning for computational biology , 2016, Molecular systems biology.

[82]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[83]  Geoffrey E. Hinton,et al.  An Efficient Learning Procedure for Deep Boltzmann Machines , 2012, Neural Computation.

[84]  Davide Castelvecchi,et al.  Can we open the black box of AI? , 2016, Nature.

[85]  Christopher Joseph Pal,et al.  Delving Deeper into Convolutional Networks for Learning Video Representations , 2015, ICLR.

[86]  Manfred Huber,et al.  Using deep learning to enhance cancer diagnosis and classication , 2013 .

[87]  Bram van Ginneken,et al.  A survey on deep learning in medical image analysis , 2017, Medical Image Anal..

[88]  Yann LeCun,et al.  Learning Fast Approximations of Sparse Coding , 2010, ICML.

[89]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[90]  Rodrigo C. Barros,et al.  A character-based convolutional neural network for language-agnostic Twitter sentiment analysis , 2017, 2017 International Joint Conference on Neural Networks (IJCNN).

[91]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[92]  Chong Wang,et al.  Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.

[93]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[94]  Charles Blundell,et al.  Early Visual Concept Learning with Unsupervised Deep Learning , 2016, ArXiv.

[95]  Sven Behnke,et al.  Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition , 2010, ICANN.

[96]  Seunghoon Hong,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[97]  Antonio Bonafonte,et al.  SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.

[98]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[99]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[100]  Jesper Jensen,et al.  Permutation invariant training of deep models for speaker-independent multi-talker speech separation , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[101]  Kilian Q. Weinberger,et al.  Deep Networks with Stochastic Depth , 2016, ECCV.

[102]  Devdatt P. Dubhashi,et al.  Extractive Summarization using Continuous Vector Space Models , 2014, CVSC@EACL.

[103]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[104]  Jonathan Krause,et al.  The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition , 2015, ECCV.

[105]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[106]  Lorenzo Torresani,et al.  Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[107]  Yunpeng Wang,et al.  Large-Scale Transportation Network Congestion Evolution Prediction Using Deep Learning Theory , 2015, PloS one.

[108]  Preslav Nakov,et al.  SemEval-2016 Task 4: Sentiment Analysis in Twitter. , 2019 .

[109]  F ROSENBLATT,et al.  The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.

[110]  Yoshua Bengio,et al.  Deep Generative Stochastic Networks Trainable by Backprop , 2013, ICML.

[111]  Mubarak Shah,et al.  UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.

[112]  Blaise Agüera y Arcas,et al.  Federated Learning of Deep Networks using Model Averaging , 2016, ArXiv.

[113]  Jeff Johnson,et al.  Fast Convolutional Nets With fbfft: A GPU Performance Evaluation , 2014, ICLR.

[114]  Luca Benini,et al.  Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes , 2017, IEEE Transactions on Parallel and Distributed Systems.

[115]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[116]  Shu-Ching Chen,et al.  T-LRA: Trend-Based Learning Rate Annealing for Deep Neural Networks , 2017, 2017 IEEE Third International Conference on Multimedia Big Data (BigMM).

[117]  Gregory S. Corrado,et al.  Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning , 2017, Nature Biomedical Engineering.

[118]  Shu-Ching Chen,et al.  A Video-Aided Semantic Analytics System for Disaster Information Integration , 2017, 2017 IEEE Third International Conference on Multimedia Big Data (BigMM).

[119]  Leo Breiman,et al.  Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001, Statistical Science.

[120]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[121]  Larry P. Heck,et al.  Learning deep structured semantic models for web search using clickthrough data , 2013, CIKM.

[122]  Bernd Bischl,et al.  Automatic model selection for high-dimensional survival analysis , 2015 .

[123]  Rasool Fakoor,et al.  Using deep learning to enhance cancer diagnosis and classication , 2013 .

[124]  Shu-Ching Chen,et al.  A Classifier Ensemble Framework for Multimedia Big Data Classification , 2016, 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI).

[125]  Min Chen,et al.  Deep Learning for Imbalanced Multimedia Data Classification , 2015, 2015 IEEE International Symposium on Multimedia (ISM).

[126]  Razvan Pascanu,et al.  How to Construct Deep Recurrent Neural Networks , 2013, ICLR.

[127]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[128]  Ronald M. Summers,et al.  Deep Learning in Medical Imaging: Overview and Future Promise of an Exciting New Technique , 2016 .

[129]  Yuichiro Shibata,et al.  Deep pipelined one-chip FPGA implementation of a real-time image-based human detection algorithm , 2011, 2011 International Conference on Field-Programmable Technology.

[130]  Shafiq R. Joty,et al.  Applications of Online Deep Learning for Crisis Response Using Social Media Information , 2016, ArXiv.

[131]  Alessio Micheli,et al.  Neural Network for Graphs: A Contextual Constructive Approach , 2009, IEEE Transactions on Neural Networks.

[132]  Bowen Zhou,et al.  ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs , 2015, TACL.

[133]  Panagiotis Tsakalides,et al.  Goal!! Event detection in sports video , 2017, Computer Vision Applications in Sports.

[134]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[135]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[136]  Roded Sharan,et al.  Using deep learning to model the hierarchical structure and function of a cell , 2018, Nature Methods.

[137]  Marc'Aurelio Ranzato,et al.  Multi-GPU Training of ConvNets , 2013, ICLR.

[138]  W S McCulloch,et al.  A logical calculus of the ideas immanent in nervous activity , 1990, The Philosophy of Artificial Intelligence.

[139]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[140]  Yann LeCun,et al.  Deep belief net learning in a long-range vision system for autonomous off-road driving , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[141]  Qunsheng Peng,et al.  Vectorization of line drawing image based on junction analysis , 2014, Science China Information Sciences.

[142]  Gerald Penn,et al.  Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[143]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[144]  Hyesoon Kim,et al.  An integrated GPU power and performance model , 2010, ISCA.

[145]  Luca Maria Gambardella,et al.  Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks , 2013, MICCAI.

[146]  Steven R. Young,et al.  A 1 TOPS/W Analog Deep Machine-Learning Engine With Floating-Gate Storage in 0.13 µm CMOS , 2014, IEEE Journal of Solid-State Circuits.

[147]  Dong Yu,et al.  Speech emotion recognition using deep neural network and extreme learning machine , 2014, INTERSPEECH.

[148]  Chris Quirk,et al.  Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources , 2004, COLING.

[149]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[150]  Jeffrey Pennington,et al.  Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection , 2011, NIPS.

[151]  Ting Chen,et al.  Deep Learning Based Automatic Immune Cell Detection for Immunohistochemistry Images , 2014, MLMI.

[152]  Samy Bengio,et al.  Torch: a modular machine learning software library , 2002 .

[153]  Jean-Claude Junqua,et al.  Robustness in Automatic Speech Recognition: Fundamentals and Applications , 1995 .

[154]  Shu-Ching Chen,et al.  MCA-NN: Multiple Correspondence Analysis Based Neural Network for Disaster Information Detection , 2017, 2017 IEEE Third International Conference on Multimedia Big Data (BigMM).

[155]  Hang Su,et al.  Experiments on Parallel Training of Deep Neural Network using Model Averaging , 2015, ArXiv.

[156]  Andrew Chou,et al.  Semantic Parsing on Freebase from Question-Answer Pairs , 2013, EMNLP.

[157]  E. Newell,et al.  Mass cytometry: blessed with the curse of dimensionality , 2016, Nature Immunology.

[158]  David Zhang,et al.  Part-based convolutional neural network for visual recognition , 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[159]  Shu-Ching Chen,et al.  Automatic Video Event Detection for Imbalance Data Using Enhanced Ensemble Deep Learning , 2017, Int. J. Semantic Comput..

[160]  Shu-Ching Chen,et al.  Correlation-Based Deep Learning for Multimedia Semantic Concept Detection , 2015, WISE.

[161]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[162]  Geoffrey E. Hinton Deep belief networks , 2009, Scholarpedia.

[163]  Alexander J. Smola,et al.  Scaling Distributed Machine Learning with the Parameter Server , 2014, OSDI.

[164]  Xiaolin Hu,et al.  Recurrent convolutional neural network for speech processing , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[165]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[166]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[167]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[168]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[169]  Ming Zhou,et al.  Question Answering over Freebase with Multi-Column Convolutional Neural Networks , 2015, ACL.

[170]  David A. Shamma,et al.  YFCC100M , 2015, Commun. ACM.

[171]  B. van Ginneken,et al.  Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis , 2016, Scientific Reports.

[172]  Geoffrey Zweig,et al.  An introduction to computational networks and the computational network toolkit (invited talk) , 2014, INTERSPEECH.

[173]  Michael I. Jordan Serial Order: A Parallel Distributed Processing Approach , 1997 .

[174]  Soroush Vosoughi,et al.  Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder , 2016, SIGIR.

[175]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[176]  Jianfeng Gao,et al.  Deep stacking networks for information retrieval , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[177]  D. Hubel,et al.  Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.

[178]  Li Deng,et al.  A tutorial survey of architectures, algorithms, and applications for deep learning , 2014, APSIPA Transactions on Signal and Information Processing.

[179]  Marc'Aurelio Ranzato,et al.  Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.