Optimal Computing Budget Allocation for Binary Classification with Noisy Labels and its Applications on Simulation Analytics

In this study, we consider the budget allocation problem for binary classification with noisy labels. Classification accuracy can be improved by reducing label noise, which can be achieved by collecting multiple independent observations of each label. An efficient budget allocation strategy is therefore needed to reduce label noise while still guaranteeing good classification accuracy. Two problem settings are investigated. The first assumes that the underlying classification structure is unknown, so a label can only be determined by comparing the sample-average estimate of a data point's Bernoulli success probability with a given threshold. The second assumes that data points with different labels can be separated by a hyperplane. For both settings, closed-form optimal budget allocation strategies are developed. A simulation analytics example demonstrates how the budget is allocated across scenarios to further improve the learning of optimal decision functions.
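As a rough illustration of the first setting, the sketch below labels a single data point by thresholding the sample mean of repeated 0/1 observations, then estimates how often that label is correct for a given number of observations. The function names, the threshold of 0.5, and the success probability of 0.7 are assumptions made for illustration; this is not the paper's allocation rule.

```python
import random


def estimate_label(observations, threshold=0.5):
    """Label a point 1 if the sample mean of its 0/1 observations exceeds the threshold."""
    if not observations:
        raise ValueError("need at least one observation")
    sample_mean = sum(observations) / len(observations)
    return 1 if sample_mean > threshold else 0


def simulate_accuracy(p, n_obs, threshold=0.5, trials=2000, seed=0):
    """Estimate the probability that n_obs noisy observations recover the true label.

    p is the (hypothetical) Bernoulli success probability of the data point;
    its true label is taken to be 1 when p exceeds the threshold.
    """
    rng = random.Random(seed)
    true_label = 1 if p > threshold else 0
    correct = 0
    for _ in range(trials):
        obs = [1 if rng.random() < p else 0 for _ in range(n_obs)]
        if estimate_label(obs, threshold) == true_label:
            correct += 1
    return correct / trials


if __name__ == "__main__":
    # More observations of the same point should reduce label noise.
    for n in (1, 5, 25):
        print(n, simulate_accuracy(p=0.7, n_obs=n))
```

Under these assumptions, a point whose success probability lies close to the threshold needs more observations to reach the same labeling accuracy than a point far from it, which is the intuition behind allocating the budget unevenly across data points.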
