A Novel Active Learning Regression Framework for Balancing the Exploration-Exploitation Trade-Off

Active learning has recently come to be regarded as a promising approach to data acquisition, because labeling data is costly in many real-world applications such as natural language processing and image processing. Most active learning methods are designed solely to improve the accuracy of the learning model. However, model accuracy may not be the primary goal, and there may be other domain-specific objectives to optimize. In this work, we develop a novel active learning framework that aims to solve a general class of optimization problems. The proposed framework mainly targets optimization problems that are subject to the exploration-exploitation trade-off. The framework is comprehensive: it includes exploration-based strategies, exploitation-based strategies, and balancing strategies that seek a balance between exploration and exploitation. The paper mainly considers regression tasks, as they are under-researched in the active learning field compared to classification tasks. Furthermore, we investigate and compare two active querying approaches: pool-based sampling and query synthesis. We apply the proposed framework to the problem of learning the price-demand function, an application that is important in optimal product pricing and dynamic (or time-varying) pricing. In our experiments, we provide a comparative study of the proposed framework's strategies and several other baselines. The results demonstrate strong performance for the proposed methods.
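To make the exploration-exploitation trade-off concrete, the following is a minimal pool-based sketch for learning a price-demand function. It is not the framework proposed in the paper. Everything in it is an illustrative assumption: the linear demand model d(p) = a - b p, the noise level, the candidate price grid, the distance-based exploration rule, the revenue-targeting exploitation rule, and the epsilon-greedy rule used to balance the two.

```python
import random

def fit_line(prices, demands):
    """Ordinary least squares for demand = a - b * price; returns (a, b)."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_d = sum(demands) / n
    sxx = sum((p - mean_p) ** 2 for p in prices)
    sxy = sum((p - mean_p) * (d - mean_d) for p, d in zip(prices, demands))
    slope = sxy / sxx
    a = mean_d - slope * mean_p
    return a, -slope

def explore(pool, queried):
    """Exploration (assumed rule): the pool price farthest from all queried prices."""
    return max(pool, key=lambda p: min(abs(p - q) for q in queried))

def exploit(pool, a, b):
    """Exploitation (assumed rule): the pool price nearest the estimated
    revenue maximizer of p * (a - b * p), which is a / (2 * b)."""
    if b <= 0:  # estimated demand is not decreasing, so no interior optimum
        return max(pool)
    target = a / (2 * b)
    return min(pool, key=lambda p: abs(p - target))

def run(n_queries=20, epsilon=0.3, seed=0):
    """Epsilon-greedy balance: explore with probability epsilon, else exploit."""
    rng = random.Random(seed)
    true_a, true_b, noise = 100.0, 2.0, 3.0   # hypothetical ground truth
    oracle = lambda p: true_a - true_b * p + rng.gauss(0.0, noise)
    pool = [1.0 + 0.5 * i for i in range(97)]  # candidate prices 1.0 .. 49.0
    queried = [pool.pop(0), pool.pop(-1)]      # two seed prices at the extremes
    demands = [oracle(p) for p in queried]
    for _ in range(n_queries):
        a, b = fit_line(queried, demands)
        if rng.random() < epsilon:
            price = explore(pool, queried)
        else:
            price = exploit(pool, a, b)
        pool.remove(price)                     # each pool price is queried once
        queried.append(price)
        demands.append(oracle(price))
    a, b = fit_line(queried, demands)
    return a, b, queried
```

Calling `run()` returns the final estimates of `a` and `b` together with the sequence of queried prices. Setting `epsilon=1.0` gives a purely exploratory strategy and `epsilon=0.0` a purely exploitative one, which is one simple way to see how the two extremes and a mixture of them choose different prices to query.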
