Memory Order Decomposition of Symbolic Sequences

We introduce a general method, based on higher-order Markov analysis, for studying memory in symbolic sequences. The Markov process that best represents a sequence is expressed as a mixture of matrices of minimal orders, which enables the definition of the so-called memory profile, a quantity that unambiguously reflects the true order of correlations. The method is validated by recovering the memory profiles of tunable synthetic sequences. Finally, we scan real data and show, with practical examples, how our protocol can be used to extract relevant stochastic properties of symbolic sequences.
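
As a rough illustration of what higher-order Markov analysis of a symbolic sequence involves, the sketch below fits order-k chains by counting and picks an order with the Bayesian information criterion. This is a generic, standard order-selection baseline and not the decomposition or memory profile introduced here; the function names `markov_log_likelihood` and `bic_order` are hypothetical and chosen only for this example.

```python
from collections import Counter
import math


def markov_log_likelihood(seq, k, start=None):
    """Maximised log-likelihood of an order-k Markov chain fitted by counting.

    Symbols before index `start` are used only as context, never predicted.
    """
    start = k if start is None else start
    ctx_counts = Counter()
    trans_counts = Counter()
    for i in range(start, len(seq)):
        ctx = tuple(seq[i - k:i])          # the k preceding symbols
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1
    # Sum of n(ctx, s) * log(n(ctx, s) / n(ctx)) over observed transitions.
    return sum(n * math.log(n / ctx_counts[ctx])
               for (ctx, _), n in trans_counts.items())


def bic_order(seq, max_order):
    """Return (best_order, scores) using BIC over orders 0..max_order.

    Assumption: the alphabet is the set of symbols observed in `seq`.
    """
    alphabet = len(set(seq))
    n = len(seq) - max_order               # same predicted symbols for every order
    scores = {}
    for k in range(max_order + 1):
        ll = markov_log_likelihood(seq, k, start=max_order)
        n_params = (alphabet ** k) * (alphabet - 1)
        scores[k] = -2.0 * ll + n_params * math.log(n)
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # A period-4 sequence: the next symbol is fixed by the previous two,
    # but not by the previous one alone.
    sequence = list("AABB" * 50)
    best, scores = bic_order(sequence, max_order=3)
    print("selected order:", best)
```

A single selected order of this kind summarises memory with one number. The memory profile described above is meant to give a finer picture by attributing weight to each order in the mixture.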
