ParaHaplo 3.0: A program package for imputation and a haplotype-based whole-genome association study using hybrid parallel computing

Background: Missing-genotype imputation and haplotype reconstruction are valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required.

Results: We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We evaluated the performance of ParaHaplo 3.0 on the HapMap dataset for Japanese in Tokyo, Japan, and Han Chinese in Beijing, China. The parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo.

Conclusions: ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. The need for faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.
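To illustrate the kind of work division that parallel imputation relies on, the following is a minimal sketch of a contiguous block decomposition: a set of independent work units (for example, genomic regions to impute) is split as evenly as possible across workers. The abstract does not describe how ParaHaplo 3.0 actually assigns work, so the function name `partition_blocks`, its parameters, and the choice of contiguous ranges are all assumptions made for illustration only.

```python
from typing import List, Tuple


def partition_blocks(n_blocks: int, n_workers: int) -> List[Tuple[int, int]]:
    """Split block indices 0..n_blocks-1 into contiguous half-open ranges.

    Hypothetical helper, not part of ParaHaplo: it returns one
    (start, stop) pair per worker, with range sizes differing by at most one.
    """
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    if n_blocks < 0:
        raise ValueError("n_blocks must be non-negative")

    # Every worker gets `base` blocks; the first `extra` workers get one more.
    base, extra = divmod(n_blocks, n_workers)
    ranges: List[Tuple[int, int]] = []
    start = 0
    for rank in range(n_workers):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Example: 10 work units shared among 4 workers.
    for rank, (start, stop) in enumerate(partition_blocks(10, 4)):
        print(f"worker {rank}: blocks [{start}, {stop})")
```

In an MPI setting, each process would typically compute only its own range from its rank and process the corresponding units independently. The speed-up achieved in practice depends on how evenly the per-unit cost is distributed, which this sketch does not model.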
