Gene Ontology annotation of sequence-specific DNA binding transcription factors: setting the stage for a large-scale curation effort

Transcription factors control which information in a genome becomes transcribed to produce RNAs that function in the biological systems of cells and organisms. Reliable and comprehensive information about transcription factors is invaluable for large-scale network-based studies. However, existing transcription factor knowledge bases still lack well-documented functional information. Here, we provide guidelines for a curation strategy that constitutes a robust framework for using the controlled vocabularies defined by the Gene Ontology Consortium to annotate sequence-specific DNA binding transcription factors (DbTFs) based on experimental evidence reported in the literature. Our standardized protocol and workflow for annotating sequence-specific DNA binding RNA polymerase II transcription factors are designed to document high-quality and decisive evidence from valid experimental methods. Within a collaborative biocuration effort involving the user community, we are now exhaustively annotating the full repertoire of human, mouse and rat proteins that qualify as DbTFs, insofar as they are experimentally documented in the biomedical literature today. The completion of this task will significantly enrich Gene Ontology-based information resources for the research community. Database URL: www.tfcheckpoint.org
