Chromosomal haplotypes by genetic phasing of human families.

Assignment of alleles to haplotypes for nearly all the variants on all chromosomes can be performed by genetic analysis of a nuclear family with three or more children. Whole-genome sequence data enable deterministic phasing of nearly all sequenced alleles by permitting assignment of recombinations to precise chromosomal positions and specific meioses. We demonstrate this process of genetic phasing on two families, each with four children. We generate haplotypes for all of the children and their parents; these haplotypes span all genotyped positions, including rare variants. Misassignments of phase between variants (switch errors) are nearly absent. Our algorithm can also produce multimegabase haplotypes for nuclear families with just two children and can handle families with missing individuals. We implement our algorithm in a suite of software scripts (Haploscribe). Haplotypes and family genome sequences will become increasingly important for personalized medicine and for fundamental biology.
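To make the idea of deterministic genetic phasing concrete, the following minimal Python sketch applies the basic Mendelian rule at a single site: given unphased genotypes for a father, a mother, and one child, it returns the child's paternal and maternal alleles when only one assignment is consistent with the parents. This is an illustration only and is not the Haploscribe algorithm; the function name `phase_child` and the tuple representation of genotypes are assumptions made for this example. Sites at which all three individuals are heterozygous remain ambiguous under this rule, which is one reason the additional children and recombination positions described above carry extra phasing information.

```python
from typing import Optional, Tuple

# Hypothetical representation: an unphased genotype is an unordered pair of alleles.
Genotype = Tuple[str, str]


def phase_child(
    father: Genotype, mother: Genotype, child: Genotype
) -> Optional[Tuple[str, str]]:
    """Return (paternal_allele, maternal_allele) for the child at one site.

    Returns None when the site is ambiguous (more than one consistent
    assignment) or Mendelian-inconsistent (no consistent assignment).
    """
    a, b = child
    candidates = set()
    # Try both ways of splitting the child's two alleles between the parents.
    for paternal, maternal in ((a, b), (b, a)):
        if paternal in father and maternal in mother:
            candidates.add((paternal, maternal))
    if len(candidates) == 1:
        return candidates.pop()
    return None


if __name__ == "__main__":
    # Father homozygous A/A, so the child's A must be paternal.
    print(phase_child(("A", "A"), ("A", "G"), ("A", "G")))
    # All three heterozygous: this rule alone cannot decide.
    print(phase_child(("A", "G"), ("A", "G"), ("A", "G")))
```

In this sketch a return value of `None` does not distinguish an ambiguous site from a genotyping error; a fuller treatment would report the two cases separately.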
