Transcription regulatory region analysis using signal detection and fuzzy clustering

MOTIVATION: Currently available programs for recognizing potential transcription factor binding sites in genomic sequences generally produce very large amounts of output. These output lists must be filtered to obtain the biologically significant elements, which is highly laborious when done manually.

RESULTS: We developed a strategy for the systematic verification and improvement of the underlying profiles, and for their contextual analysis by a fuzzy clustering approach, with non-redundant libraries of search profiles as a prerequisite.

AVAILABILITY: The tools mentioned in the paper are available upon request.

CONTACT: ewi@gbf.de
