POXO: a web-enabled tool series to discover transcription factor binding sites

We present POXO, a comprehensive tool series for discovering transcription factor binding sites from co-expressed genes. POXO handles tasks such as functional evaluation and grouping of genes, sequence retrieval, pattern discovery and pattern verification. Users can also combine these tools into tailored analytical pipelines with single mouse clicks. A typical POXO pipeline begins by examining the biological functions in which a set of co-expressed genes is involved. This examination evaluates the functional coherence of the gene set and associates representative functions with it. It can also be used to group genes into functionally similar subsets when several biological processes are affected in the experiment. The next step is to discover over-represented nucleotide patterns in the upstream sequences of the selected gene sets, which makes it possible to investigate whether the genes are co-regulated by common cis-elements. If over-represented patterns are found, similar ones can be clustered together and verified. The performance of POXO is demonstrated by analysing expression data from pathogen-treated Arabidopsis thaliana. In this example, POXO detected activated gene sets and suggested transcription factors responsible for their regulation.
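The idea of an over-represented nucleotide pattern can be illustrated with a small sketch. It is not POXO's own algorithm, which the abstract does not specify. It assumes a simple stand-in technique: count how many upstream sequences contain each k-mer, then score enrichment in the co-expressed set with a one-sided hypergeometric test. The function names, the default k-mer length of 6 and the 0.01 threshold are all hypothetical choices made for illustration, and no multiple-testing correction is applied.

```python
from math import comb


def sequences_with_kmer(seqs, kmer):
    """Count the sequences that contain the k-mer at least once."""
    return sum(1 for s in seqs if kmer in s.upper())


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n sequences are drawn without replacement from N,
    of which K contain the pattern."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def overrepresented_kmers(foreground, background, k=6, alpha=0.01):
    """Return (kmer, foreground hits, background hits, p-value) tuples for
    k-mers enriched in the foreground, sorted by p-value.

    Assumption: `background` is the full promoter set and already
    includes the foreground sequences.
    """
    n, N = len(foreground), len(background)
    # Candidate patterns: every k-mer that occurs in the foreground.
    candidates = {
        s[i:i + k].upper()
        for s in foreground
        for i in range(len(s) - k + 1)
    }
    results = []
    for kmer in candidates:
        hits_fg = sequences_with_kmer(foreground, kmer)
        hits_bg = sequences_with_kmer(background, kmer)
        p = hypergeom_upper_tail(hits_fg, n, hits_bg, N)
        if p <= alpha:
            results.append((kmer, hits_fg, hits_bg, p))
    return sorted(results, key=lambda r: r[3])


if __name__ == "__main__":
    # Toy upstream sequences, invented for illustration only.
    co_expressed = ["AAATTGACCAAA", "CCCTTGACCGGG", "GTGTTTGACCTG"]
    all_promoters = co_expressed + ["ACGTACGTACGT"] * 7
    for kmer, fg, bg, p in overrepresented_kmers(co_expressed, all_promoters):
        print(kmer, fg, bg, round(p, 5))
```

A real analysis would also have to scan both strands, account for sequence composition and correct for the number of patterns tested. Those concerns are the reason for a separate verification step after discovery.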
