Neural network training and rule extraction with augmented discretized input

The classification and prediction accuracy of neural networks can be improved when they are trained with discretized continuous attributes as additional inputs. Such input augmentation makes it easier for the network weights to form more accurate decision boundaries when the data samples of different classes in the data set are contained in distinct hyper-rectangular subregions of the original input space. In this paper, we first present how a neural network can be trained with augmented discretized inputs. The additional inputs are obtained by dividing the original interval of each continuous attribute into subintervals of equal length. The network is then pruned to remove most of the discretized inputs as well as the original continuous attributes, provided that the pruned network still meets a preset minimum accuracy requirement. We then discuss how comprehensible classification rules can be extracted from the pruned network by analyzing the activations of its hidden units and the weights of its remaining connections. Our experiments on artificial data sets show that the rules extracted from the neural networks can perfectly replicate the class membership rules used to create the data. On real-life benchmark data sets, neural networks trained with augmented discretized inputs are shown to achieve better accuracy than neural networks trained with the original data.
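As an illustration of the input augmentation step only, the following minimal sketch divides the observed range of each continuous attribute into subintervals of equal length and appends the resulting binary inputs to the original attributes. The function name `augment_with_equal_width_bins`, the parameter `n_bins`, the use of the observed minimum and maximum as the interval endpoints, and the thermometer-style coding of the bins are all assumptions made for this example; the abstract does not specify how the discretized inputs are encoded.

```python
import numpy as np

def augment_with_equal_width_bins(X, n_bins=4):
    """Append equal-width discretized inputs to the continuous attributes in X.

    Hypothetical helper for illustration; the coding scheme is an assumption.
    """
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)  # assumed lower end of each attribute's interval
    hi = X.max(axis=0)  # assumed upper end of each attribute's interval
    extra = []
    for j in range(X.shape[1]):
        # Interior cut points that split [lo, hi] into n_bins equal subintervals.
        cuts = np.linspace(lo[j], hi[j], n_bins + 1)[1:-1]
        # Thermometer coding (assumption): bit k is 1 when x >= k-th cut point.
        extra.append((X[:, [j]] >= cuts).astype(float))
    # Original attributes first, then the discretized inputs of each attribute.
    return np.hstack([X] + extra)

X = np.array([[0.0, 10.0],
              [0.3, 20.0],
              [0.6, 30.0],
              [1.0, 50.0]])
A = augment_with_equal_width_bins(X, n_bins=4)
print(A.shape)  # 2 original attributes + 2 * 3 discretized inputs per sample
```

With this coding, each of the `n_bins - 1` added inputs per attribute corresponds to a single axis-parallel threshold, which is one plausible way for a network to represent hyper-rectangular class regions with few weights.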
