Patterns of coding variation in the complete exomes of three Neandertals

Significance

We use a hybridization approach to enrich the DNA from the protein-coding fraction of the genomes of two Neandertal individuals from Spain and Croatia. By analyzing these two exomes together with the genome sequence of a Neandertal from Siberia, we show that the genetic diversity of Neandertals was lower than that of present-day humans and that the pattern of coding variation suggests that Neandertal populations were small and isolated from one another. We also show that genes involved in skeletal morphology have changed more than expected on the Neandertal evolutionary lineage, whereas genes involved in pigmentation and behavior have changed more on the modern human lineage.

Abstract

We present the DNA sequence of 17,367 protein-coding genes in two Neandertals from Spain and Croatia and analyze them together with the genome sequence recently determined from a Neandertal from southern Siberia. Comparisons with present-day humans from Africa, Europe, and Asia reveal that genetic diversity among Neandertals was remarkably low and that they carried a higher proportion of amino acid-changing (nonsynonymous) alleles inferred to alter protein structure or function than present-day humans. Thus, Neandertals across Eurasia had a smaller long-term effective population size than present-day humans. We also identify amino acid substitutions in Neandertals and present-day humans that may underlie phenotypic differences between the two groups. We find that genes involved in skeletal morphology have changed more in the lineage leading to Neandertals than in the ancestral lineage common to archaic and modern humans, whereas genes involved in behavior and pigmentation have changed more on the modern human lineage.
