Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes

SCOPE is an ensemble motif finder that runs three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference [1]. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data [1]. In this article, we use a web version of SCOPE [2] to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs [3,4] and has been used in other studies [5-8].

The three algorithms that comprise SCOPE are BEAM [9], which finds non-degenerate motifs (ACCGGT); PRISM [10], which finds degenerate motifs (ASCGWT); and SPACER [11], which finds longer bipartite motifs (ACCnnnnnnnnGGT). Each has been optimized to find its corresponding type of motif, and together they allow SCOPE to perform well.

Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif and that, when added to the original set, improve the motif score. The improvement can come through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the remaining genes regulated by the given transcription factor.

SCOPE has a user-friendly interface that lets novice users access the algorithm's full power without becoming experts in the bioinformatics of motif finding. FAQs and video tutorials are available at the SCOPE web site, which also includes a "Sample Search" button that allows the user to perform a trial run. As input, SCOPE takes a list of genes or FASTA sequences, which can be entered in browser text fields or read from a file.

Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. The output contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of every motif occurrence (with exact position and strand indicated). Results are returned in a browser window and, optionally, by email. Previous papers describe the SCOPE algorithms in detail [1,2,9-11].
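To make the three motif types concrete, the following minimal Python sketch scans a sequence for occurrences of a consensus written in IUPAC nucleotide codes (for example, S = C or G, W = A or T, N = any base) and reports each occurrence with its position and strand. It is not SCOPE's implementation and performs no de novo discovery or scoring; it only illustrates what a non-degenerate, degenerate, or bipartite consensus matches. The function names (`reverse_complement`, `consensus_to_regex`, `find_instances`) and the example sequences are hypothetical, chosen for illustration.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every IUPAC code, used to build reverse complements.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a sequence or IUPAC consensus."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that also finds overlapping hits."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    # The lookahead lets finditer report occurrences that overlap each other.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_instances(consensus, sequence):
    """List (start, strand, matched_text) for every occurrence of the consensus.

    Positions are 0-based on the given (forward) sequence. A "-" hit means
    the reverse complement of the consensus matched at that position.
    """
    sequence = sequence.upper()
    hits = [(m.start(), "+", m.group(1))
            for m in consensus_to_regex(consensus).finditer(sequence)]
    rc = reverse_complement(consensus)
    # A palindromic consensus matches both strands at the same site,
    # so it is reported once, on the "+" strand.
    if rc != consensus.upper():
        hits += [(m.start(), "-", m.group(1))
                 for m in consensus_to_regex(rc).finditer(sequence)]
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequences, one per motif type named in the text.
    print(find_instances("ACCGGT", "TTACCGGTAA"))              # non-degenerate
    print(find_instances("ASCGWT", "GGAGCGATCC"))              # degenerate
    print(find_instances("ACCnnnnnnnnGGT", "ACCTTTTTTTTGGT"))  # bipartite
```

A real motif finder must additionally decide which of the many possible consensus strings is statistically over-represented or positionally biased in the gene set; that is the part the SCOPE component algorithms address.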

[1] Robert H. Gross, et al. SPACER: identification of cis-regulatory elements with non-contiguous critical residues, 2007, Bioinform.

[2] Robert H. Gross, et al. A novel ensemble learning method for de novo computational identification of DNA binding sites, 2007, BMC Bioinformatics.

[3] Robert H. Gross, et al. SCOPE: a web server for practical de novo motif discovery, 2007, Nucleic Acids Res.

[4] F. Robert, et al. Genomewide Location Analysis of Candida albicans Upc2p, a Regulator of Sterol Metabolism and Azole Drug Resistance, 2008, Eukaryotic Cell.

[5] Deepak Sharma, et al. RegAnalyst: a web interface for the analysis of regulatory motifs, networks and pathways, 2009, Nucleic Acids Res.

[6] Jos B. T. M. Roerdink, et al. DISCLOSE: DISsection of CLusters Obtained by SEries of transcriptome data using functional annotations and putative transcription factor binding sites, 2008, BMC Bioinformatics.

[7] Jos B. T. M. Roerdink, et al. MOTIFATOR: detection and characterization of regulatory motifs using prokaryote transcriptome data, 2009, Bioinform.

[8] Victor G. Corces, et al. Three subclasses of a Drosophila insulator show distinct and cell type-specific genomic distributions, 2009, Genes & Development.

[9] Robert H. Gross, et al. Bounded search for de novo identification of degenerate cis-regulatory elements, 2006, BMC Bioinformatics.

[10] G. Boucher, et al. Identification of the Candida albicans Cap1p Regulon, 2009, Eukaryotic Cell.

[11] Robert H. Gross, et al. BEAM: A Beam Search Algorithm for the Identification of Cis-Regulatory Elements in Groups of Genes, 2006, J. Comput. Biol.