Identifying Maximal Perfect Haplotype Blocks

The concept of maximal perfect haplotype blocks is introduced as a simple pattern for identifying genomic regions that show signatures of natural selection. The model is formally defined, and a simple algorithm is presented that finds all perfect haplotype blocks in a set of phased chromosome sequences. Application to three whole chromosomes from the 1000 Genomes Project phase 3 data set shows the potential of the concept as an effective approach for quickly detecting selection in large sets of thousands of genomes.
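
The abstract does not spell out the formal definition, so the following is only an illustrative sketch under a stated assumption: a block is taken to be a set K of at least two haplotypes together with a column interval [i, j] on which all members of K are identical, such that the interval cannot be widened to the left or right while keeping all of K identical, and no further haplotype can be added to K. The function name `maximal_perfect_blocks`, the `min_rows` parameter, and the toy input are hypothetical. This is a brute-force enumeration over all column intervals, intended for small inputs only; it is not the algorithm presented in the paper.

```python
from collections import defaultdict


def maximal_perfect_blocks(haplotypes, min_rows=2):
    """Brute-force enumeration of maximal perfect blocks (illustrative only).

    haplotypes: list of equal-length strings, one per phased sequence.
    Returns a list of (rows, i, j) with 0-based, inclusive column bounds.
    """
    if not haplotypes:
        return []
    n = len(haplotypes[0])
    if any(len(h) != n for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    blocks = []
    for i in range(n):
        for j in range(i, n):
            # Group rows by their content on columns i..j. Each group is
            # row-maximal by construction: it holds every matching row.
            groups = defaultdict(list)
            for row, h in enumerate(haplotypes):
                groups[h[i:j + 1]].append(row)

            for rows in groups.values():
                if len(rows) < min_rows:
                    continue
                # Extendable to the left if all rows agree at column i-1.
                left_ext = i > 0 and len({haplotypes[r][i - 1] for r in rows}) == 1
                # Extendable to the right if all rows agree at column j+1.
                right_ext = j < n - 1 and len({haplotypes[r][j + 1] for r in rows}) == 1
                if not left_ext and not right_ext:
                    blocks.append((tuple(rows), i, j))
    return blocks


# Hypothetical toy input: four haplotypes over four biallelic sites.
example = ["0110", "0111", "1110", "0010"]
blocks = maximal_perfect_blocks(example)
for rows, i, j in blocks:
    print(rows, i, j)
```

Grouping rows by their substring makes each candidate row set as large as possible, so only the left and right extension checks remain. The cost grows quadratically with the number of sites, which is why a sketch like this would not scale to whole chromosomes.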
