Automatic Model Monitoring for Data Streams

Detecting concept drift is a well-known problem that affects production systems. However, two important issues are frequently not addressed in the literature: 1) the detection of drift when labels are not immediately available; and 2) the automatic generation of explanations that identify possible causes of the drift. For example, before labels are available, a fraud detection model in online payments could show drift caused either by a hot sale item (with an increase in false positives) or by a true fraud attack (with an increase in false negatives). In this paper we propose SAMM, an automatic model monitoring system for data streams. SAMM detects concept drift using a time- and space-efficient unsupervised streaming algorithm, and it generates alarm reports that summarize the events and features most relevant to explaining the drift. SAMM was evaluated on five real-world fraud detection datasets, each spanning periods of up to eight months and together totaling more than 22 million online transactions. We evaluated SAMM using human feedback from domain experts, to whom we sent 100 reports generated by the system. Our results show that SAMM is able to detect anomalous events in a model's life cycle that domain experts consider useful. Given these results, SAMM will be rolled out in an upcoming version of Feedzai's Fraud Detection solution.
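
To make the idea of label-free drift detection concrete, the following is a minimal sketch, not SAMM's actual algorithm, which the abstract does not specify. It assumes the monitored signal is the model score in [0, 1], compares a fixed reference histogram against a sliding window of recent scores, and raises an alarm when their Jensen-Shannon divergence exceeds a threshold. The class name `ScoreDriftMonitor`, the bin count, the window size, and the threshold are all illustrative assumptions.

```python
import math
from collections import deque


def histogram(scores, n_bins=10):
    """Normalized histogram of scores assumed to lie in [0, 1]."""
    counts = [0] * n_bins
    for s in scores:
        # Clamp the bin index so that out-of-range scores fall in an edge bin.
        idx = max(0, min(int(s * n_bins), n_bins - 1))
        counts[idx] += 1
    total = len(scores)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


class ScoreDriftMonitor:
    """Hypothetical monitor: reference histogram vs. sliding window of scores."""

    def __init__(self, reference_scores, window_size=200, threshold=0.1, n_bins=10):
        self.n_bins = n_bins
        self.reference = histogram(reference_scores, n_bins)
        self.window = deque(maxlen=window_size)  # bounded memory
        self.threshold = threshold  # assumed value; would need tuning per model

    def update(self, score):
        """Add one score; return True if the full window diverges from the reference."""
        self.window.append(score)
        if len(self.window) < self.window.maxlen:
            return False  # not enough recent data yet
        current = histogram(self.window, self.n_bins)
        return js_divergence(self.reference, current) > self.threshold
```

No labels are needed at any point: the monitor only looks at the distribution of scores, so it can fire before fraud labels arrive. This sketch recomputes the window histogram on every event for clarity; a production system would update it incrementally.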
