Reconstruction on Trees: Exponential Moment Bounds for Linear Estimators

Consider a Markov chain $(\xi_v)_{v \in V} \in [k]^V$ on the infinite $b$-ary tree $T = (V,E)$ with irreducible edge transition matrix $M$, where $b \geq 2$, $k \geq 2$, and $[k] = \{1,\ldots,k\}$. We denote by $L_n$ the set of level-$n$ vertices of $T$. Assume that the second-largest eigenvalue of $M$ in absolute value, $\lambda$, is real, with a corresponding real eigenvector $\nu \neq 0$. Letting $\sigma_v = \nu_{\xi_v}$, we consider the following root-state estimator, introduced by Mossel and Peres (2003) in the context of the "reconstruction problem" on trees: \begin{equation*} S_n = (b\lambda)^{-n} \sum_{x\in L_n} \sigma_x. \end{equation*} As noted by Mossel and Peres, when $b\lambda^2 > 1$ (the so-called Kesten-Stigum reconstruction phase), the variance of $S_n$ is bounded uniformly in $n$. Here, we give bounds on the moment-generating functions of $S_n$ and $S_n^2$ when $b\lambda^2 > 1$. Our results have implications for the inference of evolutionary trees.
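To make the definition of $S_n$ concrete, here is a minimal simulation sketch. It is not taken from the paper. It assumes that $M$ is row-stochastic with $M_{ij}$ the probability that a child is in state $j$ given that its parent is in state $i$, and that $\nu$ is a right eigenvector, $M\nu = \lambda\nu$. Under that convention, $\mathbb{E}[S_n \mid \xi_{\mathrm{root}} = i] = \nu_i$. The function name `simulate_S_n`, the 0-based state labels, and the two-state symmetric example are illustrative choices only.

```python
import numpy as np

def simulate_S_n(M, nu, lam, b, n, root_state, rng):
    """Sample the chain down to level n of the b-ary tree and return S_n.

    Assumes M is row-stochastic (M[i, j] = P(child = j | parent = i)),
    states are labelled 0..k-1, and nu is a right eigenvector for lam.
    """
    k = M.shape[0]
    cdf = np.cumsum(M, axis=1)           # row-wise CDF for inverse-CDF sampling
    states = np.array([root_state])      # level 0: the root only
    for _ in range(n):
        parents = np.repeat(states, b)   # each vertex has b children
        u = rng.random(parents.size)
        # child state = number of CDF entries of the parent's row that are <= u
        states = (u[:, None] >= cdf[parents]).sum(axis=1)
        states = np.minimum(states, k - 1)  # guard against rounding in the CDF
    return nu[states].sum() / (b * lam) ** n

# Hypothetical example: two-state symmetric channel with flip probability p.
p = 0.1
M = np.array([[1 - p, p], [p, 1 - p]])
nu = np.array([1.0, -1.0])               # right eigenvector for lam = 1 - 2p
lam = 1 - 2 * p
b = 2                                    # b * lam**2 = 1.28 > 1

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = [simulate_S_n(M, nu, lam, b, 8, 0, rng) for _ in range(4000)]
    print("empirical mean of S_8 given root state 0:", np.mean(samples))
    print("empirical variance of S_8:", np.var(samples))
```

In this example the empirical mean should lie close to $\nu_0 = 1$, and rerunning with larger $n$ gives a rough check that the variance stays bounded when $b\lambda^2 > 1$.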

[1] R. Graham, et al. Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time, 1982.

[2] H. Kesten, et al. Additional Limit Theorems for Indecomposable Multidimensional Galton-Watson Processes, 1966.

[3] Sébastien Roch, et al. Phase Transition in Distance-Based Phylogeny Reconstruction, 2011, ArXiv.

[4] Charles R. Johnson, et al. Matrix analysis, 1985.

[5] Elchanan Mossel. Phase transitions in phylogeny, 2003, Transactions of the American Mathematical Society.

[6] Tandy J. Warnow, et al. A Few Logs Suffice to Build (almost) All Trees: Part II, 1999, Theor. Comput. Sci.

[7] Sébastien Roch, et al. Sequence Length Requirement of Distance-Based Phylogeny Reconstruction: Breaking the Polynomial Barrier, 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[8] Elchanan Mossel, et al. Evolutionary trees and the Ising model on the Bethe lattice: a proof of Steel's conjecture, 2005, ArXiv.

[9] László A. Székely, et al. Inverting Random Functions II: Explicit Bounds for Discrete Maximum Likelihood Estimation, with Applications, 2002, SIAM J. Discret. Math.

[10] Elchanan Mossel, et al. Information flow on trees, 2001, math/0107033.

[11] Tamir Tuller, et al. Finding a maximum likelihood tree is hard, 2006, JACM.

[12] Y. Peres, et al. Broadcasting on trees and the Ising model, 2000.

[13] László A. Székely, et al. Inverting random functions, 1999.

[14] Sébastien Roch, et al. A short proof that phylogenetic tree reconstruction by maximum likelihood is hard, 2005, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[15] Kevin Atteson, et al. The Performance of Neighbor-Joining Methods of Phylogenetic Reconstruction, 1999, Algorithmica.

[16] S. Roch. Toward Extracting All Phylogenetic Information from Matrices of Evolutionary Distances, 2010, Science.

[17] David Sankoff, et al. Computational complexity of inferring phylogenies by compatibility, 1986.

[18] Joseph T. Chang, et al. A signal-to-noise analysis of phylogeny estimation by neighbor-joining: Insufficiency of polynomial length sequences, 2006, Mathematical Biosciences.

[19] W. H. Day. Computational complexity of inferring phylogenies from dissimilarity matrices, 1987, Bulletin of Mathematical Biology.

[20] K. Athreya, et al. Large Deviation Rates for Branching Processes. II. The Multitype Case, 1995.