Computational analysis of the human and other mammalian genomes

Working drafts are now available for the human, mouse and rat genomes, and other mammalian genome sequences are on the way. We discuss some of the key bioinformatic analysis problems presented by these data, including assembling the sequence, finding the genes and other functional elements, and reconstructing the evolutionary history of the genomes. Recent comparisons between the human and mouse genomes have revealed that approximately 5% of the human genome appears to be more conserved relative to the orthologous regions in mouse than neutral evolution alone can explain. Is this the portion of the genome under selection for specific functions? How can we use comparative genomics to further pinpoint functional elements? How accurately can we reconstruct the evolutionary history of key parts of the human genome? We briefly outline some recent work (described in more detail in Adam Siepel's talk) that combines hidden Markov models, used in bioinformatics to analyse DNA from a single species, with continuous-time Markov models of molecular evolution, used to reconstruct the evolutionary history of several species. Although these methods are still a long way from answering the questions above, they may contribute to such investigations.
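To make the combination of the two model types concrete, the following is a minimal sketch and not the method presented in the talk. It assumes a pairwise human-mouse alignment without gaps, the Jukes-Cantor substitution model as the continuous-time Markov model, and a two-state hidden Markov model whose "conserved" state evolves on a branch shortened by a factor rho. The function names and the values of t_neutral, rho and stay are illustrative assumptions, not estimates from real data.

```python
import math

def jc_prob(a, b, t):
    """Jukes-Cantor probability that base a is replaced by base b over
    branch length t (expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if a == b else 0.25 - 0.25 * decay

def column_likelihood(a, b, t):
    """Likelihood of one aligned column (a, b) under a uniform base
    distribution and total branch length t between the two species."""
    return 0.25 * jc_prob(a, b, t)

def posterior_conserved(human, mouse, t_neutral=0.5, rho=0.3, stay=0.95):
    """Posterior probability of the conserved state at each column,
    computed with the scaled forward-backward algorithm.
    All default parameter values are hypothetical."""
    if len(human) != len(mouse) or not human:
        raise ValueError("sequences must be non-empty and of equal length")
    # State 0 = neutral, state 1 = conserved (branch scaled by rho).
    branch = (t_neutral, rho * t_neutral)
    emit = [[column_likelihood(h, m, t) for t in branch]
            for h, m in zip(human, mouse)]
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]
    n = len(emit)

    # Forward pass, normalised at each column to avoid underflow.
    fwd = []
    for i in range(n):
        if i == 0:
            row = [0.5 * emit[0][s] for s in range(2)]
        else:
            row = [emit[i][s] * sum(fwd[i - 1][r] * trans[r][s] for r in range(2))
                   for s in range(2)]
        total = sum(row)
        fwd.append([x / total for x in row])

    # Backward pass, normalised the same way.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        row = [sum(trans[s][r] * emit[i + 1][r] * bwd[i + 1][r] for r in range(2))
               for s in range(2)]
        total = sum(row)
        bwd[i] = [x / total for x in row]

    # Combine and renormalise per column to obtain posteriors.
    post = []
    for i in range(n):
        w = [fwd[i][s] * bwd[i][s] for s in range(2)]
        post.append(w[1] / (w[0] + w[1]))
    return post
```

Calling posterior_conserved("ACGTACGT", "ACGTACGT") returns one value per alignment column; under the assumed parameters, runs of identical bases push the posterior towards the conserved state and runs of mismatches push it towards the neutral state. A realistic analysis would use more species, a phylogenetic tree, a richer substitution model and parameters fitted to data.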