Shannon Entropy Rate of Hidden Markov Processes

Hidden Markov chains are widely applied statistical models of stochastic processes, from fundamental physics and chemistry to finance, health, and artificial intelligence. The hidden Markov processes they generate are notoriously complicated, however, even when the underlying chain has only finitely many states: no finite closed-form expression for their Shannon entropy rate exists, since their set of predictive features is generically infinite. As a result, to date one cannot make general statements about how random these processes are or how structured. Here, we address the first part of this challenge by showing how to calculate their entropy rates efficiently and accurately. We also show how this method yields the minimal set of infinite predictive features. A sequel addresses the challenge's second part on structure.
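
Concretely, entropy-rate calculations of this kind rest on tracking an observer's belief (mixed) state eta over the hidden states: after each emitted symbol x the belief updates by Bayes' rule, and Blackwell's formula expresses the entropy rate as the average next-symbol uncertainty over the stationary distribution of mixed states, h = -E_eta[ sum_x p(x|eta) log2 p(x|eta) ]. The Python sketch below estimates this average by Monte Carlo simulation of the mixed-state dynamics. It is a minimal illustration of the general mixed-state approach, not the paper's exact algorithm; the function name entropy_rate_mc, its defaults, and the example matrices Ts are illustrative assumptions.

import numpy as np

def entropy_rate_mc(Ts, n_steps=100_000, burn_in=1_000, seed=0):
    """Monte Carlo estimate of an HMM's Shannon entropy rate (bits/symbol).

    Illustrative sketch of the mixed-state (belief-state) approach;
    names and defaults are assumptions, not the paper's code.

    Ts maps each symbol x to its labeled transition matrix T[x], where
    T[x][i, j] = Pr(emit x, go to state j | in state i) and
    sum_x T[x] is row-stochastic.
    """
    rng = np.random.default_rng(seed)
    symbols = list(Ts)
    n = next(iter(Ts.values())).shape[0]
    ones = np.ones(n)

    # Initialize the belief state at the stationary distribution of the
    # full chain T = sum_x T[x] (its leading left eigenvector).
    T = sum(Ts.values())
    evals, evecs = np.linalg.eig(T.T)
    eta = np.real(evecs[:, np.argmax(np.real(evals))])
    eta = eta / eta.sum()

    h_sum, count = 0.0, 0
    for t in range(burn_in + n_steps):
        # Next-symbol distribution given the current belief state.
        probs = np.array([eta @ Ts[x] @ ones for x in symbols])
        probs = probs / probs.sum()  # guard against round-off drift
        if t >= burn_in:
            # Accumulate H[X_t | eta_t], the next-symbol uncertainty.
            p = probs[probs > 0]
            h_sum += -np.sum(p * np.log2(p))
            count += 1
        # Sample a symbol and apply the Bayes-filter update to eta.
        x = symbols[rng.choice(len(symbols), p=probs)]
        eta = eta @ Ts[x]
        eta = eta / eta.sum()
    return h_sum / count

# Example: a simple nonunifilar source, a standard two-state binary HMM.
Ts = {
    0: np.array([[0.5, 0.0], [0.0, 0.0]]),
    1: np.array([[0.0, 0.5], [0.5, 0.5]]),
}
print(entropy_rate_mc(Ts))  # estimate of the entropy rate in bits/symbol

Averaging the expected next-symbol entropy, rather than -log2 of each sampled symbol's probability, lowers the estimator's variance at no extra cost; both averages converge to the same limit by the ergodic theorem for the mixed-state process.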
