Testing metric properties

Finite metric spaces, and in particular tree metrics, play an important role in various disciplines such as evolutionary biology and statistics. A natural family of problems concerning metrics is deciding, given a matrix <italic>M</italic>, whether or not it is a distance metric of a certain predetermined type. Here we consider the following relaxed version of such decision problems: for any given matrix <italic>M</italic> and parameter <italic>ε</italic>, we are interested in determining, by probing <italic>M</italic>, whether <italic>M</italic> has a particular metric property <italic>P</italic>, or whether it is <italic>ε-far</italic> from having the property. By <italic>ε-far</italic> we mean that more than an ε-fraction of the entries of <italic>M</italic> must be modified for it to obtain the property. The algorithm may query the matrix on entries <italic>M[i,j]</italic> of its choice, and is allowed a constant probability of error. We describe algorithms for testing Euclidean metrics, tree metrics and ultrametrics. Furthermore, we present an algorithm that tests whether a matrix <italic>M</italic> is an approximate ultrametric. In all cases the query complexity and running time are polynomial in <italic>1/ε</italic> and independent of the size of the matrix. Finally, our algorithms can be used to solve relaxed versions of the corresponding search problems in time that is sub-linear in the size of the matrix.
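To make the query model concrete, the following is a minimal Python sketch of a sampling-based check for the ultrametric property. It is an illustration only and not necessarily the algorithm of the paper: the function names, the constant c, and the sample size of roughly c/ε indices are assumptions made for this example. The sketch reads only the entries M[i][j] among the sampled indices, so the number of queries depends on ε and not on the size of the matrix. It accepts every ultrametric with certainty; no claim is made here about how likely it is to reject a matrix that is ε-far.

```python
import itertools
import math
import random


def is_ultrametric_on(M, idx):
    """Check the ultrametric conditions restricted to the indices in idx."""
    # Zero diagonal on the sampled points.
    for i in idx:
        if M[i][i] != 0:
            return False
    # Symmetry and non-negativity on sampled pairs.
    for i, j in itertools.combinations(idx, 2):
        if M[i][j] != M[j][i] or M[i][j] < 0:
            return False
    # Strong triangle inequality: M[i][j] <= max(M[i][k], M[k][j]).
    for i, j, k in itertools.permutations(idx, 3):
        if M[i][j] > max(M[i][k], M[k][j]):
            return False
    return True


def sample_test_ultrametric(M, eps, c=4, rng=None):
    """Sample about c/eps indices (hypothetical sample size) and check
    the ultrametric conditions on the induced submatrix only."""
    rng = rng or random.Random()
    n = len(M)
    s = min(n, max(3, math.ceil(c / eps)))
    idx = rng.sample(range(n), s)
    return is_ultrametric_on(M, idx)


# Two clusters {0, 1} and {2, 3}: distance 1 inside, 2 across (an ultrametric).
ultra = [
    [0, 1, 2, 2],
    [1, 0, 2, 2],
    [2, 2, 0, 1],
    [2, 2, 1, 0],
]
# Points on a line: every triple violates the strong triangle inequality.
line = [[abs(i - j) for j in range(10)] for i in range(10)]

accepted_ultra = sample_test_ultrametric(ultra, eps=0.5)
accepted_line = sample_test_ultrametric(line, eps=0.5)
```

Because the check inspects only a sampled submatrix, a violation found in the sample is a proof that M is not an ultrametric, while acceptance is only evidence that M is close to one.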
