Distinguishing protein-coding and noncoding genes in the human genome

Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of ≈24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation: the alternative hypothesis is that most of these ORFs are actually valid human genes that reflect gene innovation in the primate lineage or gene loss in the other lineages. Here, we reject this hypothesis by carefully analyzing the nonconserved ORFs—specifically, their properties in other primates. We show that the vast majority of these ORFs are random occurrences. The analysis yields, as a by-product, a major revision of the current human catalogs, cutting the number of protein-coding genes to ≈20,500. Specifically, it suggests that nonconserved ORFs should be added to the human gene catalog only if there is clear evidence of an encoded protein. It also provides a principled methodology for evaluating future proposed additions to the human gene catalog. Finally, the results indicate that there has been relatively little true innovation in mammalian protein-coding genes.


[246]  Simon C. Potter,et al.  An overview of Ensembl. , 2004, Genome research.

[247]  Carolyn J. Brown,et al.  A gene from the region of the human X inactivation centre is expressed exclusively from the inactive X chromosome , 1991, Nature.

[248]  A. Schedl,et al.  Aniridia-associated translocations, DNase hypersensitivity, sequence comparison and transgenic analysis redefine the functional domain of PAX6. , 2001, Human molecular genetics.

[249]  Myles A Brown,et al.  Spatial and temporal recruitment of androgen receptor and its coactivators involves chromosomal looping and polymerase tracking. , 2005, Molecular cell.

[250]  Kuniaki Saito,et al.  A Slicer-Mediated Mechanism for Repeat-Associated siRNA 5' End Formation in Drosophila , 2007, Science.

[251]  A. E. Tsong,et al.  Evolution of alternative transcriptional circuits with identical logic , 2006, Nature.

[252]  J. A. Marshall Graves,et al.  Two monotreme cell lines, derived from female platypuses (Ornithorhynchus anatinus; Monotremata, mammalia) , 1984, In Vitro.

[253]  J. Pettigrew,et al.  Electroreception in monotremes. , 1999, The Journal of experimental biology.

[254]  G. Hannon,et al.  MIWI2 is essential for spermatogenesis and repression of transposons in the mouse male germline. , 2007, Developmental cell.

[255]  T. Tuschl,et al.  Human Argonaute2 mediates RNA cleavage targeted by miRNAs and siRNAs. , 2004, Molecular cell.

[256]  David L. Steffen,et al.  The DNA sequence of the human X chromosome , 2005, Nature.

[257]  P. Wassarman,et al.  Features that affect secretion and assembly of zona pellucida glycoproteins during mammalian oogenesis. , 2007, Society of Reproduction and Fertility supplement.

[258]  J. Rinn,et al.  The transcriptional activity of human Chromosome 22. , 2003, Genes & development.

[259]  Nathaniel D. Heintzman,et al.  The gateway to transcription: identifying, characterizing and understanding promoters in the eukaryotic genome , 2007, Cellular and Molecular Life Sciences.

[260]  W. H. Caldwell The Embryology of Monotremata and Marsupialia. Part I. [Abstract] , 1887 .

[261]  M. Pfaller,et al.  Variations in DNA subtype and antifungal susceptibility among clinical isolates of Candida tropicalis. , 1997, Diagnostic microbiology and infectious disease.

[262]  P. T. Magee,et al.  Construction of an SfiI macrorestriction map of the Candida albicans genome , 1993, Journal of bacteriology.

[263]  E. Dees,et al.  The product of the H19 gene may function as an RNA , 1990, Molecular and cellular biology.

[264]  Kuniaki Saito,et al.  Specific association of Piwi with rasiRNAs derived from retrotransposon and heterochromatic regions in the Drosophila genome. , 2006, Genes & development.

[265]  J. Fickett,et al.  Assessment of protein coding measures. , 1992, Nucleic acids research.

[266]  E. Parisi,et al.  Sex- and tissue-specific expression of aspartic proteinases in Danio rerio (zebrafish). , 2000, Gene.

[267]  J. Trowsdale,et al.  Comparative Genomics of Natural Killer Cell Receptor Gene Clusters , 2005, PLoS genetics.

[268]  Richard D Emes,et al.  Comparison of the genomes of human and mouse lays the foundation of genome zoology. , 2003, Human molecular genetics.

[269]  V. Kim,et al.  The nuclear RNase III Drosha initiates microRNA processing , 2003, Nature.

[270]  Judith A. Blake,et al.  The Mouse Genome Database (MGD): from genes to mice—a community resource for mouse biology , 2004, Nucleic Acids Res..

[271]  Ravi Sachidanandam,et al.  A germline-specific class of small RNAs binds mammalian Piwi proteins , 2006, Nature.

[272]  Ravi Sachidanandam,et al.  Developmentally Regulated piRNA Clusters Implicate MILI in Transposon Control , 2007, Science.

[273]  B. Trask,et al.  Segmental duplications: organization and impact within the current human genome project assembly. , 2001, Genome research.

[274]  Roded Sharan,et al.  Discovering statistically significant biclusters in gene expression data , 2002, ISMB.

[275]  Chuong B. Do,et al.  CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction , 2007, Genome Biology.

[276]  Gerald J. Wyckoff,et al.  Rapid evolution of male reproductive genes in the descent of man , 2000, Nature.

[277]  P. Donnelly,et al.  Inference of population structure using multilocus genotype data. , 2000, Genetics.

[278]  Zuoren Yu,et al.  MicroRNA Mirn122a Reduces Expression of the Posttranscriptionally Regulated Germ Cell Transition Protein 2 (Tnp2) Messenger RNA (mRNA) by mRNA Cleavage1 , 2005, Biology of reproduction.

[279]  P. Sharp,et al.  Cre-lox-regulated conditional RNA interference from transgenes. , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[280]  R. Aharonov,et al.  Identification of hundreds of conserved and nonconserved human microRNAs , 2005, Nature Genetics.

[281]  Howard Y. Chang,et al.  Functional Demarcation of Active and Silent Chromatin Domains in Human HOX Loci by Noncoding RNAs , 2007, Cell.

[282]  Christopher M. Player,et al.  Large-Scale Sequencing Reveals 21U-RNAs and Additional MicroRNAs and Endogenous siRNAs in C. elegans , 2006, Cell.

[283]  Jean L. Chang,et al.  An initial strategy for the systematic identification of functional elements in the human genome by low-redundancy comparative sequencing. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[284]  J. Graves,et al.  Origin and evolution of spermatogenesis genes on the human sex chromosomes. , 2007, Society of Reproduction and Fertility supplement.

[285]  Florian Caiment,et al.  RNAi-Mediated Allelic trans-Interaction at the Imprinted Rtl1/Peg11 Locus , 2005, Current Biology.

[286]  B. Rost,et al.  Distinguishing Protein-Coding from Non-Coding RNAs through Support Vector Machines , 2006, PLoS genetics.