Applying bioinformatics in the analysis of software variants

Analysis of software similarity is an active research topic, particularly in the context of software maintenance and software reuse. Several approaches exist for detecting similar code within a single software system and across multiple systems. While working on similarity analysis of software variants, we observed many analogies between the approaches used to analyze the evolution of software and those used to analyze the evolution of biological organisms. Hence, we applied bioinformatics concepts used in genome similarity analysis, such as alignments and phylogenetic trees, to software variants. We demonstrate the usefulness of these concepts by applying them to a group of related systems from the BSD Unix family.