Stochastic input model selection

Input modeling is the selection of a probability distribution to capture the uncertainty in the input environment of a stochastic system. Example applications include representing the randomness in the time to failure for a machining process, the time between arrivals of calls to a call center, and the demand received for a product in an inventory system. Building simulations of stochastic systems requires input models that adequately represent the uncertainty in such random variables. Since there is an abundance of probability distributions that can be used for this purpose, a natural question is how to identify the distribution that best represents the particular situation under study. For example, is the exponential distribution a reasonable choice to represent the time to failure for a machining process, or is it better to use an empirical distribution function obtained from the historical time-to-failure data? Because there is no true input model waiting to be found, the goal of stochastic input modeling is to obtain an approximation that captures the key characteristics of the system inputs.
Developing a good input model requires collecting as much information as possible about the relevant randomness in the system, together with historical data consisting of past realizations of the random variables of interest. When a data set is available, the input model can be identified by fitting a probability distribution to the historical data. However, collecting data for the stochastic system under study may be difficult or costly, and it is impossible to collect any data at all when the proposed system does not yet exist. In the absence of historical data, any relevant information (e.g., expert opinion and the conventional bounds suggested by the underlying physical situation) can be used for input modeling.
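As a rough illustration of the exponential-versus-empirical question raised above, the following minimal sketch fits an exponential distribution by maximum likelihood and measures how far its distribution function lies from the empirical distribution function of the data. The data are synthetic stand-ins generated inside the script, and the helper names (`fit_exponential`, `exponential_cdf`, `ks_distance`) are hypothetical; the largest-gap (Kolmogorov-Smirnov style) distance is one possible summary of fit, not a procedure prescribed by the article.

```python
import math
import random


def fit_exponential(data):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean."""
    return len(data) / sum(data)


def exponential_cdf(x, rate):
    """Distribution function of an exponential random variable with the given rate."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def ks_distance(data, cdf):
    """Largest gap between the empirical CDF of `data` and a candidate CDF."""
    xs = sorted(data)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        gap = max(gap, abs(f - i / n), abs((i + 1) / n - f))
    return gap


# Synthetic stand-in for historical time-to-failure data (hours); not real data.
rng = random.Random(42)
times_to_failure = [rng.expovariate(1 / 120.0) for _ in range(200)]

rate = fit_exponential(times_to_failure)
distance = ks_distance(times_to_failure, lambda x: exponential_cdf(x, rate))
print(f"fitted mean time to failure: {1 / rate:.1f} hours")
print(f"largest gap to empirical CDF: {distance:.3f}")
```

A small gap suggests the exponential distribution is a reasonable approximation for these data; a large gap is a hint that the empirical distribution function, or a more flexible family, may be the better input model.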
This article addresses the key issues that arise in stochastic input modeling both in the presence and in the absence of historical data. The first step in input modeling is to identify the sources of randomness in the input environment of the system under study. Many stochastic systems contain multiple sources of uncertainty; in a manufacturing setting, for example, these might include the completion time of an item on a particular machine, the potential breakdown of the machine, and the percentage of defective items produced by the machine. Throughout, the random vector $X = (X_1, X_2, \ldots, X_K)'$ represents the collection of $K$ different inputs of a stochastic system, where $X_k$ is the random variable denoting the $k$th system input. The $K$ components of this random vector might also be correlated with each other. Therefore, the stochastic properties of the random inputs $X_k$, $k = 1, 2, \ldots, K$, are captured in the joint probability
