Variable Length Markov Chains: Methodology, Computing, and Software

We study estimation in the class of stationary variable length Markov chains (VLMCs) on a finite space. The processes in this class are still Markovian of high order, but their memory has variable length, which yields a much bigger and structurally richer class of models than ordinary high-order Markov chains. From an algorithmic view, the VLMC model class has attracted interest in information theory and machine learning, but its statistical properties have not yet been explored. Provided that good estimation is available, the additional structural richness of the model class enhances predictive power by finding a better trade-off between model bias and variance, and it allows a better structural description, which can be of specific interest. The latter is exemplified with some DNA data. A version of the tree-structured context algorithm, proposed by Rissanen in an information-theoretic set-up, is shown to have new, good asymptotic properties for estimation in the class of VLMCs. This remains true even when the underlying model increases in dimensionality. Furthermore, we give results on consistent estimation of minimal state spaces and on mixing properties of fitted models. We also propose a new bootstrap scheme based on fitted VLMCs and show its validity for quite general stationary categorical time series and for a broad range of statistical procedures.
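To make the idea of variable-length memory concrete, the following is a minimal Python sketch of a toy binary VLMC. It is not the estimation procedure studied here: the context set, its probabilities, and the names `TOY_CONTEXTS`, `find_context`, and `simulate` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy VLMC on the alphabet {0, 1}. Keys are contexts written
# most-recent-symbol first; values are P(next symbol = 1 | context).
# The numbers are illustrative assumptions, not estimates from any data.
TOY_CONTEXTS = {
    (0,): 0.3,    # last symbol 0: memory of length 1 suffices
    (1, 0): 0.6,  # last symbol 1, preceded by 0: memory of length 2
    (1, 1): 0.9,  # last symbol 1, preceded by 1: memory of length 2
}


def find_context(history, contexts):
    """Return the context that matches the most recent symbols of `history`."""
    max_len = max(len(c) for c in contexts)
    for length in range(1, max_len + 1):
        # Reverse the last `length` symbols so the most recent comes first.
        candidate = tuple(reversed(history[-length:]))
        if len(candidate) == length and candidate in contexts:
            return candidate
    raise ValueError("history too short or context set incomplete")


def simulate(n, contexts, start, seed=0):
    """Extend `start` to a sequence of length n using the toy VLMC."""
    rng = random.Random(seed)
    seq = list(start)
    while len(seq) < n:
        p_one = contexts[find_context(seq, contexts)]
        seq.append(1 if rng.random() < p_one else 0)
    return seq
```

In this toy model the relevant past has length 1 after a 0 and length 2 after a 1, so three contexts are enough, whereas a full second-order binary Markov chain needs four states. Calling `simulate(50, TOY_CONTEXTS, [0, 1])` generates a sequence from the toy model.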
