The power of amnesia: Learning probabilistic automata with variable memory length
[1] Claude E. Shannon,et al. Prediction and Entropy of Printed English , 1951 .
[2] F. Jelinek. Fast sequential decoding algorithm using a stack , 1969 .
[3] L. Baum,et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .
[4] L. Baum,et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .
[5] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[6] Abraham Lempel,et al. Compression of individual sequences via variable-rate coding , 1978, IEEE Trans. Inf. Theory.
[7] Jorma Rissanen,et al. A universal data compression system , 1983, IEEE Trans. Inf. Theory.
[8] A. Nadas,et al. Estimation of probabilities in the language model of the IBM speech recognition system , 1984 .
[9] Frederick Jelinek,et al. Markov Source Modeling of Text Generation , 1985 .
[10] Jorma Rissanen,et al. Complexity of strings in the class of Markov sources , 1986, IEEE Trans. Inf. Theory.
[11] Milena Mihail,et al. Conductance and convergence of Markov chains-a combinatorial treatment of expanders , 1989, 30th Annual Symposium on Foundations of Computer Science.
[12] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[13] Anselm Blumer. Applications of DAWGs to data compression , 1990 .
[14] Manfred K. Warmuth,et al. On the Computational Complexity of Approximating Distributions by Probabilistic Automata , 1990, COLT '90.
[15] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[16] P. Krishnan,et al. Optimal prefetching via data compression , 1991, Proceedings 32nd Annual Symposium of Foundations of Computer Science.
[17] J. A. Fill. Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains , 1991 .
[18] Eyal Kushilevitz,et al. Learning decision trees using the Fourier spectrum , 1991, STOC '91.
[19] Abraham Lempel,et al. A sequential algorithm for the universal coding of finite memory sources , 1992, IEEE Trans. Inf. Theory.
[20] Dana Ron,et al. The Power of Amnesia , 1993, NIPS.
[21] Klaus-Uwe Höffgen,et al. Learning and robust learning of product distributions , 1993, COLT '93.
[22] Ronitt Rubinfeld,et al. Efficient learning of typical finite automata from random walks , 1993, STOC.
[23] D. Haussler,et al. A hidden Markov model that finds genes in E. coli DNA. , 1994, Nucleic acids research.
[24] P. Krishnan,et al. Optimal prediction for prefetching in the worst case , 1994, SODA '94.
[25] Hinrich Schütze,et al. Part-of-Speech Tagging Using a Variable Memory Markov Model , 1994, ACL.
[26] Michael Sipser,et al. Inference and minimization of hidden Markov chains , 1994, COLT '94.
[27] Ronitt Rubinfeld,et al. On the learnability of discrete distributions , 1994, STOC '94.
[28] Frans M. J. Willems,et al. The context-tree weighting method: basic properties , 1995, IEEE Trans. Inf. Theory.
[29] Dana Ron,et al. On the learnability and usage of acyclic probabilistic finite automata , 1995, COLT '95.
[30] J. Cleary,et al. Self-organized Language Modeling for Speech Recognition , 2022 .
[31] Naoki Abe,et al. On the computational complexity of approximating distributions by probabilistic automata , 1990, Machine Learning.
[32] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[33] Ronald Saul,et al. Discrete sequence prediction and its applications , 2005, Machine Learning.