Data, health, and algorithmics: computational challenges for biomedicine

In the decade following the completion of the Human Genome Project's working draft in 2000, the cost of sequencing DNA fell by a factor of around a million, and it continues to fall. Applications of sequencing in health include precise diagnosis of infection and disease, lifestyle management, and development of highly targeted treatments. However, the volume and complexity of the data produced by these technologies present a severe computational challenge. Breakthroughs in methods for search, storage, and analysis are required to keep pace with the flow of data and to make use of the changes in biomedical knowledge that these technologies are creating. This keynote gives an overview of some of these technologies and the new computational obstacles they have engendered, and reviews examples of algorithmic innovations and approaches currently being explored. These examples illustrate both the kinds of solutions that are required and the challenges that must be addressed before this data can be fully exploited.