Bayesian Methods and Extensions for the Two State Markov Modulated Poisson Process

We develop a framework for detecting fraud committed by a criminal existing outside the network of accounts he victimizes. The cornerstone of our approach is a Markov modulated Poisson process (MMPP), a Poisson process on which a second Poisson process is superimposed at random intervals determined by a Markov process. The MMPP is known in queuing theory but rare in statistics. The Markov process describes the presence/absence of the criminal. The second Poisson process describes the criminal's traffic when he is present. The theory switches from continuous time to a discrete index set by considering intervals between observations as a sequence of dependent random variables. A result from hidden Markov models (Baum et al., 1970) enables sampling of MMPP parameters from their posterior distribution, given event times, using a Gibbs sampler with only two steps per iteration. One Gibbs step reconstructs the original Poisson and Markov processes from their superposition. The other samples model parameters given the complete data. Displays are produced showing the probability a criminal is present as a function of time. Several extensions to the MMPP are developed. An expanded definition of hidden Markov models is given, enabling consideration of background traffic generated by a nonhomogeneous Poisson process. Parallel series of events are modeled with mixtures of hierarchical models. Covariates associated with each event are allowed. The ability to stochastically reconstruct the original processes from their superposition in a single Gibbs step is preserved throughout. Our approach is an important departure from earlier fraud detection procedures focusing on characteristics of a single event. The model accepts output from such procedures as covariates, resulting in a principled accumulation of evidence over time. From an applied perspective this means the algorithm may be viewed either as a supplement to existing fraud detection systems or as a free-standing procedure in its own right. This and other features of the model are illustrated on an AT&T data set containing information about fraud in international telephone traffic. This thesis was supervised by Arthur Dempster.
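To make the two-step Gibbs sampler concrete, the following is a minimal sketch in Python/NumPy. It approximates the model by binning time and treating counts per bin as Poisson, whereas the thesis works directly with continuous-time inter-event intervals; the gamma priors on the rates, beta priors on the transition probabilities, the function name gibbs_mmpp, and the toy data are assumptions made only for illustration and are not the thesis's actual implementation.

```python
# Sketch of the two-block Gibbs sampler for a two-state MMPP, using a
# discretized approximation: time is cut into bins and event counts per
# bin are modeled as Poisson.  The thesis works directly with
# continuous-time inter-event intervals; the priors (gamma on rates,
# beta on transition probabilities), variable names, and toy data below
# are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

def gibbs_mmpp(counts, n_iter=2000):
    """counts: array of event counts per time bin."""
    T = len(counts)
    lam0, lam1 = counts.mean(), 1.0      # background rate, extra rate when criminal present
    p01, p10 = 0.1, 0.1                  # hidden-chain transition probabilities
    prob_present = np.zeros(T)           # running sum of sampled state paths
    for _ in range(n_iter):
        # Step 1a: forward-filter, backward-sample the hidden Markov chain
        # given the current parameters (the Baum et al. recursion).
        rates = np.array([lam0, lam0 + lam1])
        P = np.array([[1 - p01, p01], [p10, 1 - p10]])
        like = np.exp(-rates)[None, :] * rates[None, :] ** counts[:, None]
        alpha = np.zeros((T, 2))
        alpha[0] = 0.5 * like[0]
        alpha[0] /= alpha[0].sum()
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ P) * like[t]
            alpha[t] /= alpha[t].sum()
        s = np.zeros(T, dtype=int)
        s[-1] = rng.choice(2, p=alpha[-1])
        for t in range(T - 2, -1, -1):
            w = alpha[t] * P[:, s[t + 1]]
            s[t] = rng.choice(2, p=w / w.sum())
        # Step 1b: reconstruct the superposition by attributing each event in
        # a state-1 bin to the background or the fraud process (binomial
        # thinning with probability proportional to the two rates).
        on = s == 1
        fraud = np.zeros(T, dtype=int)
        fraud[on] = rng.binomial(counts[on], lam1 / (lam0 + lam1))
        background = counts - fraud
        # Step 2: sample parameters given the complete data via conjugate
        # updates (Gamma(1,1) priors on rates, Beta(1,1) on transitions).
        lam0 = rng.gamma(1.0 + background.sum(), 1.0 / (1.0 + T))
        lam1 = rng.gamma(1.0 + fraud.sum(), 1.0 / (1.0 + on.sum()))
        n01 = np.sum((s[:-1] == 0) & (s[1:] == 1))
        n10 = np.sum((s[:-1] == 1) & (s[1:] == 0))
        p01 = rng.beta(1 + n01, 1 + np.sum(s[:-1] == 0) - n01)
        p10 = rng.beta(1 + n10, 1 + np.sum(s[:-1] == 1) - n10)
        prob_present += s
    return prob_present / n_iter         # approximate P(criminal present) per bin

# Toy data: background rate 2 per bin, extra rate 6 while the criminal is present.
true_state = np.r_[np.zeros(40), np.ones(20), np.zeros(40)].astype(int)
counts = rng.poisson(2.0 + 6.0 * true_state)
print(np.round(gibbs_mmpp(counts), 2))
```

The averaged state indicators returned by the sketch approximate the posterior probability that the criminal is present in each time bin, which is the quantity plotted in the displays described above.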

[1]  S. Chib,et al.  Bayes inference via Gibbs sampling of autoregressive time series subject to Markov mean and variance shifts , 1993 .

[2]  L. Baum,et al.  Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .

[3]  C. Morris Natural Exponential Families with Quadratic Variance Functions , 1982 .

[4]  A Bayesian treatment of nonresponse when sampling from a dichotomous population , 1985 .

[5]  E. Kolaczyk Bayesian Multiscale Models for Poisson Processes , 1999 .

[6]  D. Gaver,et al.  Robust empirical bayes analyses of event rates , 1987 .

[7]  G. McLachlan,et al.  Fitting mixture models to grouped and truncated data via the EM algorithm. , 1988, Biometrics.

[8]  M. Rajagopalan,et al.  Bayes estimates of mixing proportions in finite mixture distributions , 1991 .

[9]  W. K. Hastings,et al.  Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .

[10]  S. Chib Calculating posterior distributions and modal estimates in Markov mixture models , 1996 .

[11]  L. Baum,et al.  A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[12]  A. F. Smith,et al.  Statistical analysis of finite mixture distributions , 1986 .

[13]  D. Rubin,et al.  The analysis of repeated-measures data on schizophrenic reaction times using mixture models. , 1995, Statistics in medicine.

[14]  C. Morris Natural Exponential Families with Quadratic Variance Functions: Statistical Theory , 1983 .

[15]  P. Green,et al.  Corrigendum: On Bayesian analysis of mixtures with an unknown number of components , 1997 .

[16]  C. Morris,et al.  Hierarchical Poisson Regression Modeling , 1997 .

[17]  C. Robert,et al.  Bayesian estimation of hidden Markov chains: a stochastic implementation , 1993 .

[18]  G. McLachlan,et al.  Algorithm AS 254: maximum likelihood estimation from grouped and truncated data with finite normal mixture models , 1990 .

[19]  Tom Leonard Bayesian simultaneous estimation for several multinomial distributions , 1977 .

[20]  Adrian F. M. Smith,et al.  Sampling-Based Approaches to Calculating Marginal Densities , 1990 .

[21]  Biing-Hwang Juang,et al.  Hidden Markov Models for Speech Recognition , 1991 .

[22]  Qing Du A monotonicity result for a single-server queue subject to a Markov-modulated Poisson process , 1995 .

[23]  William H. Press,et al.  Numerical Recipes: The Art of Scientific Computing, Second Edition , 1998 .

[24]  S. P. Pederson,et al.  Hidden Markov and Other Models for Discrete-Valued Time Series , 1998 .

[25]  A. Cohen,et al.  Finite Mixture Distributions , 1982 .

[26]  J. Besag,et al.  Bayesian Computation and Stochastic Systems , 1995 .

[27]  R. L. Plackett,et al.  Inference sensitivity for Poisson mixtures , 1978 .

[28]  Jim Albert,et al.  A Bayesian Analysis of a Poisson Random Effects Model for Home Run Hitters , 1992 .

[29]  Stephen E. Fienberg,et al.  Discrete Multivariate Analysis: Theory and Practice , 1977 .

[30]  A. Raftery,et al.  Model-based Gaussian and non-Gaussian clustering , 1993 .

[31]  C. McLaren,et al.  Detection of two-component mixtures of lognormal distributions in grouped, doubly truncated data: analysis of red blood cell volume distributions. , 1991, Biometrics.

[32]  D. Cox Some Statistical Methods Connected with Series of Events , 1955 .

[33]  J. F. Crook,et al.  The Powers and Strengths of Tests for Multinomials and Contingency Tables , 1982 .

[34]  Ronald A. Thisted,et al.  Elements of statistical computing , 1986 .

[35]  Scott L. Zeger,et al.  Generalized linear models with random effects: a Gibbs sampling approach , 1991 .

[36]  D. Rubin,et al.  ML estimation of the t distribution using EM and its extensions, ECM and ECME , 1999 .

[37]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[38]  J. Wendelberger Adventures in Stochastic Processes , 1993 .

[39]  P. Green Reversible jump Markov chain Monte Carlo computation and Bayesian model determination , 1995 .

[40]  C. Morris Parametric Empirical Bayes Inference: Theory and Applications , 1983 .

[41]  Daniel B. Carr,et al.  Scatterplot matrix techniques for large N , 1986 .

[42]  Donald Geman,et al.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  James F. Nelson Multivariate Gamma-Poisson Models , 1985 .

[44]  P. Müller,et al.  Bayesian curve fitting using multivariate normal mixtures , 1996 .

[45]  L. Shepp,et al.  A Poisson process whose rate is a hidden Markov process , 1982 .

[46]  W. Turin Fitting probabilistic automata via the EM algorithm , 1996 .

[47]  M. Greenwood,et al.  An Inquiry into the Nature of Frequency Distributions Representative of Multiple Happenings with Particular Reference to the Occurrence of Multiple Attacks of Disease or of Repeated Accidents , 1920 .

[48]  M. Puterman,et al.  Maximum-penalized-likelihood estimation for independent and Markov-dependent mixture models. , 1992, Biometrics.

[49]  A. G. Arbous,et al.  Accident statistics and the concept of accident-proneness , 1951 .

[50]  G. Casella,et al.  Explaining the Gibbs Sampler , 1992 .

[51]  A. Davison,et al.  Some Models for Discretized Series of Events , 1996 .

[52]  D. Rubin,et al.  The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence , 1994 .

[53]  G. Yule On the Distribution of Deaths with Age when the Causes of Death Act Cumulatively, and Similar Frequency Distributions , 1910 .

[54]  S. Karlin,et al.  A second course in stochastic processes , 1981 .

[55]  J. Rao,et al.  Small-Sample Comparisons of Level and Power for Simple Goodness-of-Fit Statistics under Cluster Sampling , 1987 .

[56]  W. D. Ray Hidden Markov and other models for discrete-valued time series , 1997 .