Apricot: A Weight-Adaptation Approach to Fixing Deep Learning Models

A deep learning (DL) model is inherently imprecise. To address this problem, existing techniques retrain a DL model over a larger training dataset, with the aid of fault-injected models, or using insights drawn from the model's failing test cases. In this paper, we present Apricot, a novel weight-adaptation approach that fixes DL models iteratively. Our key observation is that if the architecture of a DL model is trained over many different subsets of the original training dataset, the weights of the resulting reduced DL models (rDLMs) can indicate the direction and magnitude by which the weights of the original DL model should be adjusted to handle the test cases it misclassifies. Apricot generates a set of such rDLMs from the original DL model. In each iteration, for each failing test case of the input DL model (iDLM), Apricot adjusts each weight of the iDLM toward the average weight of the rDLMs that classify the test case correctly and/or away from the average weight of the rDLMs that misclassify it; it then trains the weight-adjusted iDLM over the original training dataset to produce the iDLM for the next iteration. Experiments on five state-of-the-art DL models show that Apricot increases their test accuracy by 0.87%-1.55%, with an average of 1.08%. The experiments also reveal the complementary nature of the rDLMs in Apricot.
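To make the per-iteration adjustment concrete, the sketch below illustrates the weight-adaptation step in Python. It is a minimal illustration, not the authors' implementation: the names `adjust_weights` and `predict`, the flat-array weight representation, and the step size `ETA` are assumptions introduced here for clarity.

```python
import numpy as np

ETA = 0.01  # hypothetical step size; the paper's adjustment strategy may differ


def adjust_weights(idlm_w, rdlm_ws, x, label, predict):
    """One Apricot-style adjustment of the iDLM's weights for a single
    failing test case x with ground-truth label.

    idlm_w  : flat NumPy array of the iDLM's weights
    rdlm_ws : list of flat NumPy arrays, one per reduced model (rDLM)
    predict : function (weights, x) -> predicted label
    """
    # Partition the rDLMs by whether they classify this test case correctly.
    correct = [w for w in rdlm_ws if predict(w, x) == label]
    wrong = [w for w in rdlm_ws if predict(w, x) != label]

    adjusted = idlm_w.copy()
    if correct:
        # Move toward the mean weights of the correctly classifying rDLMs.
        adjusted += ETA * (np.mean(correct, axis=0) - adjusted)
    if wrong:
        # Move away from the mean weights of the misclassifying rDLMs.
        adjusted -= ETA * (np.mean(wrong, axis=0) - adjusted)
    return adjusted
```

In the full loop, this adjustment would be applied for every failing test case, and the weight-adjusted iDLM would then be retrained over the original training dataset to obtain the next iteration's iDLM, as the abstract describes.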
