Raalin, a transcript enriched in the honey bee brain, is a remnant of genomic rearrangement in Hymenoptera

We identified a predicted compact cysteine‐rich sequence in the honey bee genome that we called ‘Raalin’. Raalin transcripts are enriched in the brain of adult honey bee workers and drones, with only minimal expression in other tissues or in pre‐adult stages. Open‐reading frame (ORF) homologues of Raalin were identified in the transcriptomes of fruit flies, mosquitoes and moths. The Raalin‐like gene from Drosophila melanogaster encodes a short secreted protein that is maximally expressed in the adult brain, with negligible expression in other tissues or pre‐imaginal stages. Raalin‐like sequences have also been found in the recently sequenced genomes of six ant species, but not in the jewel wasp Nasonia vitripennis. As in the honey bee, the Raalin‐like sequences of ants do not have an ORF. A comparison of the genome region containing Raalin in the genomes of bees, ants and the wasp provides evolutionary support for an extensive genome rearrangement in this sequence. Our analyses identify a new family of ancient cysteine‐rich short sequences in insects in which insertions and genome rearrangements may have disrupted this locus in the branch leading to the Hymenoptera. The regulated expression of this transcript suggests that it has a brain‐specific function.
