A Large-Deviation Analysis of the Maximum-Likelihood Learning of Markov Tree Structures
Lang Tong | Vincent Y. F. Tan | Anima Anandkumar | Alan S. Willsky
[1] A. Wald et al. On the Statistical Treatment of Linear Stochastic Difference Equations, 1943.
[2] J. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem, 1956.
[3] R. Prim. Shortest connection networks and some generalizations, 1957.
[4] Amiel Feinstein et al. Information and information stability of random variables and processes, 1964.
[5] W. Rudin. Principles of mathematical analysis, 1964.
[6] Robert B. Ash. Information Theory, 1965.
[7] C. K. Chow and C. N. Liu. Approximating discrete probability distributions with dependence trees, 1968, IEEE Trans. Inf. Theory.
[8] Harry L. Van Trees et al. Detection, Estimation, and Modulation Theory, Part I, 1968.
[9] Patrick Billingsley et al. Weak convergence of measures - applications in probability, 1971, CBMS-NSF Regional Conference Series in Applied Mathematics.
[10] Terry J. Wagner et al. Consistency of an estimate of tree-dependent probability distributions (Corresp.), 1973, IEEE Trans. Inf. Theory.
[11] P. J. Huber et al. Minimax Tests and the Neyman-Pearson Lemma for Capacities, 1973.
[12] G. Schwarz. Estimating the Dimension of a Model, 1978.
[13] A. Kester et al. Large Deviations of Estimators, 1986.
[14] Thomas M. Cover et al. Elements of Information Theory, 2005.
[15] Ofer Zeitouni et al. On universal hypotheses testing via large deviations, 1991, IEEE Trans. Inf. Theory.
[16] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.
[17] Kathryn Fraughnaugh et al. Introduction to graph theory, 1973, Mathematical Gazette.
[18] Amir Dembo et al. Large Deviations Techniques and Applications, 1998.
[19] Michael I. Jordan. Graphical Models, 2003.
[20] Shun-ichi Amari et al. Methods of information geometry, 2000.
[21] A. Antos et al. Convergence properties of functional estimates for discrete distributions, 2001.
[22] Marcus Hutter et al. Distribution of Mutual Information, 2001, NIPS.
[23] Michael I. Jordan et al. Thin Junction Trees, 2001, NIPS.
[24] Thomas H. Cormen et al. Introduction to Algorithms, 2nd ed., 2001.
[25] David R. Karger et al. Learning Markov networks: maximum bounded tree-width graphs, 2001, SODA '01.
[26] Shun-ichi Amari et al. Information geometry on hierarchy of probability distributions, 2001, IEEE Trans. Inf. Theory.
[27] Liam Paninski et al. Estimation of Entropy and Mutual Information, 2003, Neural Computation.
[28] Imre Csiszár et al. Information projections revisited, 2000, IEEE Trans. Inf. Theory.
[29] Miroslav Dudík et al. Performance Guarantees for Regularized Maximum Entropy Density Estimation, 2004, COLT.
[30] J. Chazottes et al. Large deviations for empirical entropies of g-measures, 2004, math/0406083.
[31] Lizhong Zheng et al. I-Projection and the Geometry of Error Exponents, 2006.
[32] Sean P. Meyn et al. Worst-case large-deviation asymptotics with application to queueing and information theory, 2006.
[33] N. Meinshausen et al. High-dimensional graphs and variable selection with the Lasso, 2006, math/0608017.
[34] Martin J. Wainwright et al. High-Dimensional Graphical Model Selection Using ℓ1-Regularized Logistic Regression, 2006, NIPS.
[35] Eytan Domany et al. On the Number of Samples Needed to Learn the Correct Structure of a Bayesian Network, 2006, UAI.
[36] J. N. Laneman. On the Distribution of Mutual Information, 2006.
[37] Daphne Koller et al. Efficient Structure Learning of Markov Networks using L1-Regularization, 2006, NIPS.
[38] Carlos Guestrin et al. Efficient Principled Learning of Thin Junction Trees, 2007, NIPS.
[39] Thomas Hofmann et al. Efficient Structure Learning of Markov Networks using L1-Regularization, 2007.
[40] Venkat Chandrasekaran et al. Learning Markov Structure by Maximum Entropy Relaxation, 2007, AISTATS.
[41] Richard E. Neapolitan et al. Learning Bayesian networks, 2007, KDD '07.
[42] B. Schölkopf et al. High-Dimensional Graphical Model Selection Using ℓ1-Regularized Logistic Regression, 2007.
[43] Lizhong Zheng et al. Euclidean Information Theory, 2008, IEEE International Zurich Seminar on Communications.
[44] Lizhong Zheng et al. Linear universal decoding for compound channels: an Euclidean Geometric Approach, 2008, IEEE International Symposium on Information Theory.
[45] Lang Tong et al. A large-deviation analysis for the maximum likelihood learning of tree structures, 2009, IEEE International Symposium on Information Theory.
[46] S. Varadhan et al. Large deviations, 2019, Graduate Studies in Mathematics.
[47] Vincent Y. F. Tan et al. Learning Gaussian Tree Models: Analysis of Error Exponents and Extremal Structures, 2009, IEEE Transactions on Signal Processing.
[48] Vincent Y. F. Tan et al. Error exponents for composite hypothesis testing of Markov forest distributions, 2010, IEEE International Symposium on Information Theory.
[49] Imre Csiszár et al. Information Theory: Coding Theorems for Discrete Memoryless Systems, Second Edition, 2011.
[50] Vincent Y. F. Tan et al. Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates, 2010, J. Mach. Learn. Res.
[51] Vincent Y. F. Tan et al. Learning Latent Tree Graphical Models, 2010, J. Mach. Learn. Res.
[52] Sean P. Meyn et al. Universal and Composite Hypothesis Testing via Mismatched Divergence, 2009, IEEE Transactions on Information Theory.