Teolenn: an efficient and customizable workflow to design high-quality probes for microarray experiments

Despite the development of new high-throughput sequencing techniques, microarrays remain attractive tools for studying small-genome organisms, thanks to sample multiplexing and high feature densities. However, oligonucleotide design remains a delicate step for most users. Many software packages address this problem, but each follows its own strategy, which makes it difficult to choose the best solution. Here we describe Teolenn, a universal probe design workflow built on a flexible and customizable modular organization that supports the generation of fixed- or variable-length oligonucleotides. In addition, our software supplies quality scores for each designed probe. To assess the relevance of these scores, we performed a real hybridization using a tiling array designed against the genome of the fungus Trichoderma reesei. We show that our scoring pipeline correlates with signal quality for 97.2% of all the designed probes, allowing for a posteriori comparisons between quality scores and signal intensities. This result makes it possible to discard poorly scoring probes during the design step and thereby obtain high-quality microarrays. Teolenn is available at http://transcriptome.ens.fr/teolenn/.
