Hidden representations in deep neural networks: Part 1. Classification problems

Abstract Deep neural networks have evolved into a powerful tool applicable to a wide range of problems. However, a clear understanding of their internal mechanisms has yet to be developed. Factors such as the architecture, the number of hidden layers and neurons, and the activation function are largely determined by guess-and-test, a practice more reminiscent of alchemy than of chemistry. In this paper, we address these concerns systematically, using carefully chosen model systems to gain insight into classification problems. We show that wider networks produce several simple patterns in the input space, whereas deeper networks produce more complex patterns. We also show how each layer transforms the input space, and we identify the origins of techniques such as transfer learning, weight normalization, and early stopping. This paper is an initial step towards a systematic approach for uncovering key hidden properties that can be exploited to improve both the performance and the understanding of deep neural networks.
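As a concrete illustration of the wide-versus-deep claim above, the following minimal sketch (not the authors' code; the toy dataset, layer sizes, and training settings are illustrative assumptions) trains a wide shallow MLP and a deep narrow MLP on a 2D classification problem and plots the decision regions each one carves out of the input space.

```python
# A minimal sketch, assuming scikit-learn and matplotlib are available:
# compare the input-space decision regions of a wide shallow MLP and a
# deep narrow MLP with roughly comparable total neuron counts.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# 2D toy classification problem (two interleaving half-moons).
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

# Wide-but-shallow vs. narrow-but-deep (64 neurons each, arbitrary choice).
wide = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
deep = MLPClassifier(hidden_layer_sizes=(8,) * 8, max_iter=2000, random_state=0)

# Dense grid over the input space for visualizing decision regions.
xx, yy = np.meshgrid(np.linspace(-2, 3, 300), np.linspace(-1.5, 2, 300))
grid = np.c_[xx.ravel(), yy.ravel()]

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, model, title in zip(axes, (wide, deep),
                            ("wide: 1 hidden layer", "deep: 8 hidden layers")):
    model.fit(X, y)
    zz = model.predict(grid).reshape(xx.shape)
    ax.contourf(xx, yy, zz, alpha=0.3)          # patterns identified on input space
    ax.scatter(X[:, 0], X[:, 1], c=y, s=10)
    ax.set_title(title)
plt.show()
```

On problems like this, the shallow network typically yields a few simple, smooth regions, while the deeper network tends to produce more intricate boundaries, consistent with the patterns described in the abstract.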
