GenomeCAT: a versatile tool for the analysis and integrative visualization of DNA copy number variants

Background: The analysis of DNA copy number variants (CNVs) has an increasing impact on genetic diagnostics and research. However, the interpretation of CNV data derived from high-resolution array CGH or NGS platforms is complicated by the considerable variability of the human genome. Tools for multidimensional data analysis and for the comparison of patient cohorts are therefore needed to help discriminate clinically relevant CNVs from others.

Results: We developed GenomeCAT, a standalone Java application for the analysis and integrative visualization of CNVs. GenomeCAT is composed of three modules, dedicated respectively to the inspection of single cases, the comparative analysis of multidimensional data, and group comparisons that aim to identify recurrent aberrations in patients sharing the same phenotype. Its flexible import options ease the comparison of in-house results derived from microarray or NGS platforms with data from the literature or public repositories. Multidimensional data obtained from different experiment types can be merged into a common data matrix to enable joint visualization and analysis. All results are stored in the integrated MySQL database, but can also be exported as tab-delimited files for further statistical calculations in external programs.

Conclusions: GenomeCAT offers a broad spectrum of visualization and analysis tools that assist in the evaluation of CNVs in the context of other experimental data and annotations. Using GenomeCAT does not require any specialized computer skills. The various R packages implemented for data analysis are fully integrated into GenomeCAT's graphical user interface, and the installation process is supported by a wizard. The flexibility of data import and export, combined with the ability to create a common data matrix, also makes the program well suited as an interface between genomic data from heterogeneous sources and external software tools. Owing to its modular architecture, the functionality of GenomeCAT can easily be extended by further R packages or customized plug-ins to meet future requirements.
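
The idea of a common data matrix can be illustrated with a minimal sketch. It is not GenomeCAT's implementation; it only assumes that tracks from different experiment types are given as genomic segments on one chromosome and are projected onto shared fixed-size bins using an overlap-weighted mean. The function names (`common_matrix`, `to_tsv`), the track names, and the bin size are all hypothetical.

```python
import csv
import io


def common_matrix(tracks, chrom_length, bin_size):
    """Project every track onto the same fixed-size bins (assumed scheme).

    tracks: dict mapping a track name to a list of (start, end, value)
            segments with half-open coordinates on a single chromosome.
    Returns (bins, matrix); matrix[name][i] is the overlap-weighted mean
    of the track in bin i, or None if no segment overlaps that bin.
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    bins = [(i * bin_size, min((i + 1) * bin_size, chrom_length))
            for i in range(n_bins)]
    matrix = {}
    for name, segments in tracks.items():
        row = []
        for b_start, b_end in bins:
            weighted, covered = 0.0, 0
            for s_start, s_end, value in segments:
                overlap = min(b_end, s_end) - max(b_start, s_start)
                if overlap > 0:
                    weighted += value * overlap
                    covered += overlap
            row.append(weighted / covered if covered else None)
        matrix[name] = row
    return bins, matrix


def to_tsv(bins, matrix):
    """Serialize the matrix as tab-delimited text, one row per bin."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    names = list(matrix)
    writer.writerow(["start", "end"] + names)
    for i, (start, end) in enumerate(bins):
        cells = ["NA" if matrix[n][i] is None else f"{matrix[n][i]:.3f}"
                 for n in names]
        writer.writerow([start, end] + cells)
    return out.getvalue()


# Toy input: a copy number track and an expression track (made-up values).
tracks = {
    "array_cgh_log2": [(0, 150, -1.0), (150, 300, 0.0)],
    "expression": [(100, 200, 2.0)],
}
bins, matrix = common_matrix(tracks, chrom_length=300, bin_size=100)
tsv_text = to_tsv(bins, matrix)
print(tsv_text)
```

Once all tracks share the same rows, they can be plotted side by side or passed to external statistics software as a single tab-delimited file. Fixed-size binning is only one possible choice; the matrix rows could equally be defined by genes or by array probes.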
