Building Population-Specific Reference Genomes: A Case Study of Vietnamese Reference Genome

The human reference genome is an essential tool for studying human genomes. The standard reference genome is constructed from the genomes of a few donors, yet the 1000 Genomes Project has revealed substantial genetic differences among populations. This raises the question of whether the standard reference genome works well for all human genome studies or whether population-specific reference genomes are needed. In this paper, we present a pipeline for constructing and evaluating a population-specific reference genome. We applied the pipeline to build a Vietnamese reference genome from 100 Kinh Vietnamese genomes obtained from the 1000 Genomes Project. In our experiments, the resulting Vietnamese reference genome outperformed the standard reference genome on Vietnamese genomic data, improving the quality of short-read mapping and genotype calling for Vietnamese genomes. The pipeline is applicable to building and evaluating other population-specific reference genomes. This is the first Vietnamese reference genome to be built, and it is now available for further Vietnamese genome studies.