Using clustering trees for learning phylogenetic trees

This paper presents ongoing work on applying machine learning to phylogenetic analysis, the study of evolutionary relatedness among groups of organisms. Insight into evolutionary relationships is important because it can help determine the function of uncharacterized genes and predict future variants of fast-growing viruses. More precisely, we focus on the following task: given a set of DNA sequences that all originate from a single ancestral sequence via successive mutations, find the phylogenetic tree that describes the evolutionary process. Section 2 explains what phylogenetic trees are and how they are usually built. Section 3 presents clustering trees, which Section 4 uses to construct phylogenetic trees. Section 5 summarizes the advantages of our approach.