Statistical Methods Based on Universal Codes

We show how universal codes can be used to solve some of the most important statistical problems for time series. By definition, a universal code (or universal lossless data compressor) can compress any sequence generated by a stationary and ergodic source asymptotically down to the Shannon entropy rate, which, in turn, is the best achievable compression rate for lossless data compressors.
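To make the connection between compression and statistics concrete, here is a minimal sketch of the basic idea: the compressed length per symbol serves as an entropy estimate, since for a universal code it converges to the entropy rate of the source. The sketch uses Python's off-the-shelf bz2 compressor as a stand-in for a universal code; the function name and the biased binary test source are illustrative assumptions, not taken from the paper.

```python
import bz2
import random

def entropy_estimate_bits_per_symbol(seq: bytes) -> float:
    """Estimate the entropy rate (bits per symbol) of a sequence by
    compressing it and dividing the compressed length by the input length.
    bz2 is used here as a stand-in for a universal lossless code."""
    compressed = bz2.compress(seq, compresslevel=9)
    return 8 * len(compressed) / len(seq)

# Illustrative biased binary source: P('0') = 0.9, P('1') = 0.1.
# Its true Shannon entropy is about 0.469 bits per symbol.
random.seed(0)
sample = bytes(random.choices(b"01", weights=[9, 1], k=100_000))

print(f"compression-based estimate: "
      f"{entropy_estimate_bits_per_symbol(sample):.3f} bits/symbol")
```

On a finite sample the estimate typically sits somewhat above the true entropy because of compressor overhead; the point of universality is that this gap vanishes asymptotically for any stationary and ergodic source.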
