Characterizing phylogenetically decisive taxon coverage

Abstract

Increasingly, biologists are constructing evolutionary trees on large numbers of overlapping sets of taxa and then combining them into a ‘supertree’ that classifies all the taxa. In this paper, we ask how much of the total set of taxa these subsets must cover to ensure that we have enough information to reconstruct the supertree uniquely. We describe two results: a combinatorial characterization of the covering subsets that ensures that at most one supertree can be constructed from the smaller trees (whichever trees these may be), and a more liberal analysis that asks only that the supertree be highly likely to be uniquely specified by the tree structure on the covering subsets.