An Active Learning Methodology for Efficient Estimation of Expensive Noisy Black-Box Functions Using Gaussian Process Regression

Estimating a black-box function often requires a large number of expensive, noisy evaluations. Active learning algorithms can compare the similarity between evaluated and unevaluated points to sequentially select the most informative points, so that expensive functions are estimated efficiently. In this paper, we propose an active learning methodology that integrates Laplacian regularization with the Active Learning-Cohn (ALC) measure to identify the most informative points for efficient estimation of noisy black-box functions using Gaussian process regression. We propose two simple greedy search algorithms that sequentially optimize the tuning parameters and select subsequent points based on information from previously evaluated points. We also enhance the graph Laplacian with information from both the predictor and response variables to capture the similarity between points more effectively. The proposed methodology is particularly well suited to problems that involve estimating expensive black-box functions with high noise levels and many unevaluated points. Using a case study on the kinematics of pitching in baseball as well as simulation experiments, we compare the performance of the proposed methodology with existing methods in the literature in terms of estimation error.
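
To make the selection criterion concrete, the following is a minimal sketch of greedy ALC point selection for a Gaussian process with a squared-exponential kernel: the next point is the candidate whose addition most reduces the average posterior variance over a reference set. All names (rbf_kernel, predictive_variance, alc_next_point) and parameter values here are illustrative assumptions, not the paper's implementation; in particular, the sketch assumes unit signal variance and omits the Laplacian-regularized similarity and the tuning-parameter search that the proposed methodology adds.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between row-sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def predictive_variance(X_train, X_ref, noise_var=0.1, lengthscale=1.0):
    """GP posterior variance at each reference point, given the design X_train.
    Assumes unit prior (signal) variance for simplicity."""
    K = rbf_kernel(X_train, X_train, lengthscale) + noise_var * np.eye(len(X_train))
    k_star = rbf_kernel(X_ref, X_train, lengthscale)
    # diag of k(x, x) - k_star K^{-1} k_star^T, computed row-wise
    return 1.0 - np.einsum("ij,ij->i", k_star, np.linalg.solve(K, k_star.T).T)

def alc_next_point(X_train, X_cand, X_ref, noise_var=0.1, lengthscale=1.0):
    """Greedy ALC step: return the index of the candidate whose addition
    most reduces the average posterior variance over the reference set."""
    base = predictive_variance(X_train, X_ref, noise_var, lengthscale).mean()
    scores = []
    for x in X_cand:
        X_aug = np.vstack([X_train, x[None, :]])
        scores.append(base - predictive_variance(X_aug, X_ref, noise_var, lengthscale).mean())
    return int(np.argmax(scores))

# Hypothetical usage: pick the next point from a pool of unevaluated candidates.
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(5, 2))
X_cand = rng.uniform(size=(50, 2))
next_idx = alc_next_point(X_train, X_cand, X_ref=X_cand)
```

In a sequential procedure this step would be repeated: the selected point is evaluated, appended to the design, and the criterion is recomputed, with the kernel hyperparameters re-tuned between iterations.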
