Probe selection algorithms with applications in the analysis of microbial communities

We propose two efficient heuristics for minimizing the number of oligonucleotide probes needed to analyze populations of ribosomal RNA gene (rDNA) clones by hybridization experiments on DNA microarrays. Such analyses have applications in the study of microbial communities. Unlike the classical SBH (sequencing by hybridization) procedure, in which multiple probes are placed on a DNA chip, our application performs a series of experiments, each of which applies a single probe to a DNA microarray containing a large sample of rDNA sequences from the studied population. The overall cost of the analysis is thus roughly proportional to the number of experiments, which is why the number of probes must be minimized. Our algorithms are based on two well-known optimization techniques, simulated annealing and Lagrangian relaxation, and our preliminary tests demonstrate that both algorithms are able to find satisfactory probe sets for real rDNA data.
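To make the simulated annealing idea concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not state the exact objective, so the sketch assumes one: a probe set is acceptable when every pair of sequences receives a different hybridization fingerprint. It also assumes that hybridization can be modeled as exact substring matching. The function names, the toy sequences, the penalty weight, and the cooling schedule are all hypothetical choices made for illustration.

```python
import math
import random
from itertools import combinations


def hybridizes(probe, seq):
    # Simplifying assumption: a probe hybridizes iff it occurs as an exact substring.
    return probe in seq


def undistinguished_pairs(probes, seqs):
    """Count pairs of sequences whose hybridization fingerprints are identical."""
    fingerprints = [tuple(hybridizes(p, s) for p in probes) for s in seqs]
    return sum(1 for a, b in combinations(fingerprints, 2) if a == b)


def cost(probes, seqs, penalty):
    # Number of probes (one experiment each) plus a penalty per unresolved pair.
    return len(probes) + penalty * undistinguished_pairs(probes, seqs)


def anneal(candidates, seqs, steps=5000, t0=2.0, cooling=0.999, seed=0):
    """Search for a small subset of candidates that separates all sequences."""
    rng = random.Random(seed)
    # With this weight, any set leaving a pair unresolved costs more than
    # any set that resolves every pair (hypothetical choice).
    penalty = len(candidates) + 1
    current = set(candidates)          # start from the full candidate set
    cur_cost = cost(sorted(current), seqs, penalty)
    best, best_cost = set(current), cur_cost
    t = t0
    for _ in range(steps):
        # Neighbor move: toggle one randomly chosen probe in or out of the set.
        neighbor = current ^ {rng.choice(candidates)}
        new_cost = cost(sorted(neighbor), seqs, penalty)
        delta = new_cost - cur_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, cur_cost = neighbor, new_cost
            if cur_cost < best_cost:
                best, best_cost = set(current), cur_cost
        t *= cooling                   # geometric cooling schedule
    return sorted(best)


# Toy data, invented for illustration only.
seqs = ["ACGTAC", "ACGGTT", "TTGACA", "GGTACC"]
candidates = ["ACG", "GTA", "TTG", "GGT", "TAC"]
selected = anneal(candidates, seqs)
print(selected, undistinguished_pairs(selected, seqs))
```

Because the search starts from the full candidate set and only records strictly cheaper sets as the best so far, the returned set separates all sequences whenever the full candidate set does. The sketch makes no claim about how close the result is to the true minimum.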
