HUGenomics: A support to personalized medicine research

In the coming years, human genome research will likely transform medical practice. Genome-wide association studies (GWAS) are one example of this research effort, allowing scientists to identify genes involved in human disease, response to treatment, or symptom severity. Indeed, an individual's unique genetic profile and knowledge of the molecular basis of diseases are leading to the development of personalized medicines and therapies. However, the exponential growth of available genomic data demands a computational effort that may limit the progress of personalized medicine. Within this context, we propose the development of a novel integrated hardware and software system named HUGenomics. The framework aims to become an advanced support for personalized medicine research. Through more efficient algorithms and the integration of data from different biological sources, HUGenomics aims to simplify the interpretation of biological information and to facilitate the genomic research process by means of both computational and data visualization tools.
