Using high-fidelity meta-models to improve performance of small dataset trained Bayesian Networks

Abstract: Machine Learning (ML) is increasingly used by companies such as Google, Amazon, and Apple to identify market trends and predict customer behavior. As these ML tools continue to improve and mature, they will support better decision making across a number of industries. However, many ML methods require large amounts of data before they can be applied, and in a number of realistic situations only small datasets are available (i.e., hundreds to thousands of points). This work investigates the feasibility of using meta-models, specifically Kriging and Radial Basis Functions (RBF), to generate data for training a Bayesian Network (BN) when only a small amount of original data is available. The paper details the meta-model creation process and the results of using Particle Swarm Optimization (PSO) to tune parameters for four network structures trained on three relatively small datasets. Additionally, a series of experiments augments these small datasets by generating ten thousand, one hundred thousand, and one million synthetic data points with the Kriging and RBF meta-models, and by intelligently establishing prior probabilities using PSO. Results show that augmenting limited existing datasets with meta-model generated data can dramatically affect network accuracy. Overall, the exploratory results demonstrate the feasibility of using meta-model generated data to increase the accuracy of BNs trained on small sample sets. Further development of this method will help underserved areas with access to only small datasets make use of the powerful predictive analytics of ML.
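As a rough illustration of the augmentation idea described in the abstract, the sketch below fits a Radial Basis Function meta-model to a small dataset and then queries it at new input points to produce synthetic samples. It is a minimal sketch under stated assumptions, not the paper's implementation: the Gaussian kernel, the shape parameter `epsilon`, the small `ridge` term, the uniform sampling of inputs inside the bounding box of the original data, and the toy response function are all illustrative choices that do not come from the source. The function names `fit_rbf`, `predict_rbf`, and `augment` are hypothetical.

```python
import numpy as np

def fit_rbf(X, y, epsilon=1.0, ridge=1e-8):
    """Solve for Gaussian RBF weights fitted to the points (X, y).

    epsilon (kernel shape) and ridge (numerical stabilizer) are assumed values.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # pairwise squared distances
    phi = np.exp(-epsilon * d2)                               # Gaussian kernel matrix
    return np.linalg.solve(phi + ridge * np.eye(len(X)), y)

def predict_rbf(X_train, weights, X_new, epsilon=1.0):
    """Evaluate the fitted RBF meta-model at new input points."""
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-epsilon * d2) @ weights

def augment(X, y, n_synthetic, epsilon=1.0, seed=0):
    """Append n_synthetic meta-model generated samples to the original data."""
    rng = np.random.default_rng(seed)
    weights = fit_rbf(X, y, epsilon)
    # Assumption: new inputs are drawn uniformly within the range of the original inputs.
    X_syn = rng.uniform(X.min(axis=0), X.max(axis=0), size=(n_synthetic, X.shape[1]))
    y_syn = predict_rbf(X, weights, X_syn, epsilon)
    return np.vstack([X, X_syn]), np.concatenate([y, y_syn])

# Toy small dataset (hypothetical): 100 points, two inputs, one smooth response.
rng = np.random.default_rng(42)
X_small = rng.uniform(0.0, 1.0, size=(100, 2))
y_small = np.sin(3.0 * X_small[:, 0]) + X_small[:, 1] ** 2

# Generate ten thousand synthetic points, the smallest augmentation size in the abstract.
X_aug, y_aug = augment(X_small, y_small, n_synthetic=10_000)
print(X_aug.shape, y_aug.shape)
```

The augmented arrays would still need to be discretized and passed to a BN learning routine, which is outside the scope of this sketch. A Kriging meta-model could be substituted for the RBF surrogate by replacing `fit_rbf` and `predict_rbf` while leaving `augment` otherwise unchanged.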
