Non-Parametric Stochastic Sequential Assignment With Random Arrival Times

We consider a problem in which jobs arrive at random times and carry random values. On each arrival, the decision-maker must decide immediately whether to accept the job and gain the value on offer as a reward, subject to the constraint that at most n jobs may be accepted over some reference time period. The decision-maker only has access to M independent realisations of the job arrival process. We propose an algorithm, Non-Parametric Sequential Allocation (NPSA), for solving this problem. Moreover, we prove that the expected reward returned by the NPSA algorithm converges in probability to optimality as M grows large. We demonstrate the effectiveness of the algorithm empirically on synthetic data and on public fraud-detection datasets, the application that motivates this work.
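To make the setting concrete, the following is a minimal sketch of the problem only; it is not the NPSA algorithm, whose details are not given here. It assumes, purely for illustration, that jobs arrive as a homogeneous Poisson process on [0, horizon] with values uniform on [0, 1), and it uses a naive fixed-threshold baseline estimated from M past realisations. The function names, the arrival rate, and the threshold rule are all hypothetical.

```python
import random


def simulate_episode(rng, rate=20.0, horizon=1.0):
    """One realisation of the (assumed) job process: (arrival_time, value) pairs.

    Assumption: homogeneous Poisson arrivals with uniform values. The abstract
    does not specify any particular arrival or value distribution.
    """
    jobs, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # exponential inter-arrival gap
        if t > horizon:
            return jobs
        jobs.append((t, rng.random()))


def quantile_threshold(history, n):
    """Naive baseline (not NPSA): a value threshold chosen so that, across the
    M historical realisations, about n jobs per realisation exceed it."""
    values = sorted((v for episode in history for _, v in episode), reverse=True)
    k = n * len(history)
    if k >= len(values):
        return 0.0  # fewer than n jobs per realisation on average: accept anything
    return values[k]


def run_policy(episode, threshold, n):
    """Decide on each job at its arrival; accept at most n jobs in total."""
    accepted = []
    for _, value in episode:
        if len(accepted) < n and value > threshold:
            accepted.append(value)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    history = [simulate_episode(rng) for _ in range(100)]  # the M realisations
    threshold = quantile_threshold(history, n=3)
    reward = sum(run_policy(simulate_episode(rng), threshold, n=3))
    print(f"threshold={threshold:.3f}, reward={reward:.3f}")
```

A fixed threshold ignores both the time remaining and the number of acceptances left, so it serves only as a simple point of comparison for a policy that adapts its decisions over the reference period.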
