Needs Assessment for Research Use of High-Throughput Sequencing at a Large Academic Medical Center

Next-generation sequencing (NGS) methods are driving profound changes in biomedical research, with a growing impact on patient care. Many academic medical centers are evaluating potential models to prepare for the rapid increase in NGS information needs. As a precursor to informed decision making at our institution, we undertook a systematic needs assessment of investigators using survey methods. The study investigated (1) how and where sequencing data are generated and analyzed, (2) research objectives and goals for NGS, (3) workforce capacity and unmet needs, (4) storage capacity and unmet needs, (5) available and anticipated funding resources, and (6) future challenges. We recruited 331 investigators from over 60 departments and divisions at the University of Pittsburgh Schools of Health Sciences and received 140 responses, a 42% response rate. The results suggest that bottlenecks currently exist in both sequencing and analysis. Respondents identified significant educational needs, both investigator-focused, such as selecting NGS methods suited to specific research objectives, and program-focused, such as support for training an analytic workforce. The absence of centralized infrastructure was identified as an important institutional gap. Based on the survey responses, we formulated key principles for organizations managing this change. This needs assessment provides an in-depth case study that may be useful to other academic medical centers as they identify and plan for future needs.
