Paper information - MEME: discovering and analyzing DNA and protein sequence motifs

MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel ‘signals’ in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool (MAST) to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs, and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.
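To make the idea of an ungapped motif and of searching a sequence for matches to it more concrete, here is a minimal Python sketch. It is not MEME's or MAST's actual algorithm: it only builds a log-odds position weight matrix from a handful of pre-aligned sites and slides it along a sequence. The function names `build_pwm` and `best_match`, the pseudocount, the uniform background, and the toy sequences are all assumptions made for illustration.

```python
import math


def build_pwm(sites, alphabet="ACGT", pseudocount=1.0, background=None):
    """Turn equal-length, ungapped sites into a log-odds position weight matrix."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length (ungapped motif)")
    if background is None:
        # Assumption: uniform background frequencies.
        background = {a: 1.0 / len(alphabet) for a in alphabet}
    pwm = []
    for pos in range(width):
        # Start every count at the pseudocount so no probability is zero.
        counts = {a: pseudocount for a in alphabet}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {a: math.log2(counts[a] / total / background[a]) for a in alphabet}
        )
    return pwm


def best_match(pwm, sequence):
    """Slide the motif along the sequence without gaps.

    Returns (score, offset, word) for the highest-scoring window,
    or None if the sequence is shorter than the motif.
    """
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        word = sequence[start:start + width]
        score = sum(column[ch] for column, ch in zip(pwm, word))
        if best is None or score > best[0]:
            best = (score, start, word)
    return best


# Hypothetical toy data, not taken from the paper.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
score, offset, word = best_match(pwm, "CCGTTGACTCAGGAT")
print(offset, word)  # 4 TGACTCA
```

The sketch assumes the input contains only letters from the chosen alphabet; any other character raises a `KeyError`. A real motif discovery tool must also find the sites in the first place and assess their statistical significance, which this example does not attempt.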
