Classification Tree Generation Constrained with Variable Weights

Trees are a useful framework for classifying entities whose attributes are, at least partially, related through a common ancestry, such as species of organisms, family members or languages. In some common applications, such as phylogenetic trees based on DNA sequences, relatedness can be inferred from the statistical analysis of unweighted attributes. In this paper, we present a Constraint Programming approach that enforces consistency between tree topologies and bounds on the relative weight of each trait, so that the user can better determine which sets of traits to use and how the entities are likely to be related.