Optimization of Deep Learning using Various Optimizers, Loss Functions and Dropout

Deep Learning has gained considerable prominence due to its breakthrough results in fields such as Computer Vision, Natural Language Processing, Time Series Analysis, and Health Care. Early Deep Learning models were trained with batch and stochastic gradient descent and a limited set of optimizers, which constrained model performance. Today, much of the ongoing work focuses on improving Deep Learning performance through better optimization techniques. In this context, it is proposed to build Deep Learning models using various optimizers (Adagrad, RMSProp, Adam), loss functions (mean squared error, binary cross entropy), and dropout for both Convolutional Neural Networks and Recurrent Neural Networks, and to evaluate their performance in terms of accuracy and loss. The proposed model achieves its maximum accuracy when the Adam optimizer and the mean squared error loss function are applied to Convolutional Neural Networks, and its minimum loss when the same Adam optimizer and mean squared error loss function are applied to Recurrent Neural Networks. When regularizing the model, the maximum accuracy is achieved when dropout with a minimum fraction 'p' of nodes is applied to Convolutional Neural Networks, and the minimum loss when the same dropout value is applied to Recurrent Neural Networks.
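For concreteness, a minimal sketch of the kind of configuration described above is shown below: a small CNN compiled with the Adam optimizer, mean squared error loss, and a dropout layer. The framework (TensorFlow/Keras), the 28x28 single-channel input shape, the layer sizes, and the dropout fraction p = 0.2 are assumptions for illustration only and are not specified in the paper.

    # Minimal sketch (assumptions: TensorFlow/Keras, MNIST-style 28x28 grayscale
    # input, 10 classes, dropout fraction p = 0.2; none of these are specified
    # in the paper).
    import tensorflow as tf
    from tensorflow.keras import layers, models

    def build_cnn(p: float = 0.2) -> tf.keras.Model:
        """Small CNN with a dropout layer, compiled with Adam + MSE loss."""
        model = models.Sequential([
            layers.Input(shape=(28, 28, 1)),
            layers.Conv2D(32, (3, 3), activation="relu"),
            layers.MaxPooling2D((2, 2)),
            layers.Conv2D(64, (3, 3), activation="relu"),
            layers.MaxPooling2D((2, 2)),
            layers.Flatten(),
            layers.Dropout(p),                       # drop a fraction p of units for regularization
            layers.Dense(10, activation="softmax"),  # 10-class output
        ])
        # Mean squared error on one-hot labels; accuracy tracked as in the paper.
        model.compile(optimizer=tf.keras.optimizers.Adam(),
                      loss="mean_squared_error",
                      metrics=["accuracy"])
        return model

    model = build_cnn(p=0.2)
    model.summary()

The other configurations compared in the study can be obtained by substituting the optimizer (e.g., tf.keras.optimizers.Adagrad() or RMSprop()) or the loss (e.g., "binary_crossentropy") in the compile call, or by replacing the convolutional layers with recurrent ones.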
