Characterization of random walks on space of unordered trees using efficient metric simulation

The simple random walk on $\mathbb{Z}^p$ exhibits two drastically different behaviours depending on the value of $p$: it is recurrent when $p\in\{1,2\}$, whereas it escapes (at a rate increasing with $p$) as soon as $p\geq3$. This classical example illustrates that the asymptotic properties of a random walk provide some information on the structure of its state space. This paper aims to explore analogous questions on spaces made up of combinatorial objects with no algebraic structure. We take as a model for this problem the space of unordered unlabeled rooted trees endowed with Zhang's edit distance. To this end, we define the canonical unbiased random walk on the space of trees and provide an efficient algorithm to evaluate its escape rate. Compared to Zhang's algorithm, this algorithm is incremental and computes the edit distance along the random walk approximately 100 times faster, on average, on trees of size $500$. The escape rate of the random walk on trees is precisely estimated using intensive numerical simulations, which would be out of reasonable reach without the incremental algorithm.
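
The classical dichotomy on $\mathbb{Z}^p$ recalled above can be observed numerically. The following is a minimal Python sketch, not the paper's algorithm for trees: it simulates simple random walks on $\mathbb{Z}^p$ and reports the fraction that revisit the origin within a finite horizon. The function names, the horizon `n_steps`, the number of walks `n_walks` and the seed are all illustrative assumptions, and a finite horizon can only suggest, not prove, recurrence or transience.

```python
import random


def returns_to_origin(p, n_steps, rng):
    """Run one simple random walk on Z^p; True if it revisits the origin."""
    pos = [0] * p
    for _ in range(n_steps):
        axis = rng.randrange(p)            # pick one of the p coordinates
        pos[axis] += rng.choice((-1, 1))   # move one unit along that axis
        if not any(pos):                   # all coordinates are zero again
            return True
    return False


def return_frequency(p, n_steps=1000, n_walks=300, seed=0):
    """Empirical fraction of walks that come back to the origin (illustrative)."""
    rng = random.Random(seed)
    hits = sum(returns_to_origin(p, n_steps, rng) for _ in range(n_walks))
    return hits / n_walks


if __name__ == "__main__":
    for p in (1, 2, 3):
        print(p, return_frequency(p))
```

With these assumed parameters, the observed frequency should be close to 1 for $p=1$ and stay well below 1 for $p=3$, where the true return probability is about $0.34$. For $p=2$ the walk is recurrent, but returns are slow enough that a finite horizon noticeably underestimates the limit.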
