In silico characterization and evolutionary analyses of CCAAT binding proteins in the lycophyte plant Selaginella moellendorffii genome: A growing comparative genomics resource

NF-Y transcription factors, encoded by the HAP gene family, are composed of three subunits (HAP2/NF-YA, HAP3/NF-YB and HAP5/NF-YC) and regulate the transcription of target genes with high specificity by binding to CCAAT-containing promoter sequences. Here, we have characterized duplicated HAP genes in Selaginella moellendorffii and explored features that might be involved in the regulation of their expression and function. Subsequently, the evolutionary relationships of LEC1-type HAP3 genes were studied from lycophytes to angiosperms to reveal the details of conservation and diversification of these genes during plant evolution. Computational analyses demonstrated variation in the length of the cis-regulatory regions of the HAP3 duplicates in S. moellendorffii, which contain three thermodynamically stable and evolutionarily conserved RNA secondary structures. Homology modeling of the NF-Y proteins, secondary structural details, large positive DNA-binding patches, the binding affinity of H2A-H2B interactive residues of NF-YC subunits for the duplicated NF-YB subunits, conserved domain analyses and protein structural alignments indicated that duplication of HAP genes in S. moellendorffii, followed by structural diversification, provides specific hints about their functional specificity under various circumstances for the survival of this lycophyte. We have identified several motifs in LEC1 proteins that have remained conserved among all plant lineages during evolution.
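As a rough illustration of what "CCAAT-containing promoter sequences" means in practice, the sketch below scans a DNA string for the CCAAT pentamer on both strands. It is not the pipeline used in this study; the function name `find_ccaat_boxes` and the toy promoter string are hypothetical and chosen only for the example.

```python
def find_ccaat_boxes(sequence):
    """Return (0-based position, strand) for every CCAAT pentamer.

    A CCAAT box on the reverse strand appears as ATTGG when reading
    the forward strand, so both patterns are checked.
    """
    seq = sequence.upper()
    motif = "CCAAT"
    motif_revcomp = "ATTGG"  # reverse complement of CCAAT
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == motif_revcomp:
            hits.append((i, "-"))
    return hits


# Invented toy sequence, NOT taken from the S. moellendorffii genome.
toy_promoter = "GATTGGCTACCAATCAGTTTCCAATGG"
for position, strand in find_ccaat_boxes(toy_promoter):
    print(position, strand)
```

A plain pentamer match like this only flags candidate sites; it says nothing about flanking-sequence preferences or whether a site is actually bound by the NF-Y complex.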
