YMAP: a pipeline for visualization of copy number variation and loss of heterozygosity in eukaryotic pathogens

The design of effective antimicrobial therapies for serious eukaryotic pathogens requires a clear understanding of their highly variable genomes. To facilitate analysis of copy number variations, single nucleotide polymorphisms, and loss of heterozygosity events in these pathogens, we developed a pipeline for analyzing diverse genome-scale datasets from microarray, deep sequencing, and restriction site-associated DNA sequencing experiments for clinical and laboratory strains of Candida albicans, the most prevalent human fungal pathogen. The YMAP pipeline (http://lovelace.cs.umn.edu/Ymap/) automatically illustrates genome-wide information in a single intuitive figure and is readily modified for the analysis of other pathogens with small genomes.
