Grid Based Genome Wide Studies on Atrial Flutter

Genetic linkage analysis of SNP (Single Nucleotide Polymorphism) markers makes it possible to discover genetic correlations in complex diseases by following their transmission through the generations of a family. However, all major algorithms proposed in the literature demand substantial computational power and memory, which makes large data sets very hard to analyze on a single CPU. A facility for whole-genome linkage analysis has been set up as a web application on top of a highly distributed infrastructure, the EGEE Grid. Test cases were run with 10,000 to one million SNPs per chip and, after validation, the application was used effectively in a study on cardiac conduction disorders.
