Demonstration of Protein-Based Human Identification Using the Hair Shaft Proteome

Human identification from biological material depends largely on the ability to characterize genetic polymorphisms in DNA. Unfortunately, DNA can degrade in the environment, sometimes below the level at which it can be amplified by PCR. Protein, however, is chemically more robust than DNA and can persist for longer periods. Protein also contains genetic variation in the form of single amino acid polymorphisms, which can be used to infer the status of non-synonymous single nucleotide polymorphism alleles. To demonstrate this, we used mass spectrometry-based shotgun proteomics to characterize hair shaft proteins in 66 European-American subjects. A total of 596 single nucleotide polymorphism alleles were correctly imputed in 32 loci from 22 genes of subjects’ DNA and directly validated using Sanger sequencing. Applying the product rule to estimate the probability of each individual non-synonymous single nucleotide polymorphism allelic profile in the European population yielded a maximum power of discrimination of 1 in 12,500. Imputed non-synonymous single nucleotide polymorphism profiles from European-American subjects were considerably less frequent in the African population (maximum likelihood ratio = 11,000). The converse was true for hair shafts collected from an additional 10 subjects with African ancestry, where some profiles were more frequent in the African population. Genetically variant peptides were also identified in hair shaft datasets from six archaeological skeletal remains (up to 260 years old). This study demonstrates that quantifiable measures of identity discrimination and biogeographic background can be obtained by detecting genetically variant peptides in hair shaft protein, including hair from bioarchaeological contexts.
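
The product-rule and likelihood-ratio calculations mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the per-locus genotype frequencies are hypothetical, chosen only to show the arithmetic, and the product rule is valid only under the assumption that the loci are statistically independent.

```python
import math


def profile_probability(genotype_freqs):
    """Product rule: multiply the per-locus genotype frequencies.

    Assumes the loci are statistically independent; linked loci
    would violate this assumption.
    """
    return math.prod(genotype_freqs)


def likelihood_ratio(freqs_pop_a, freqs_pop_b):
    """How many times more probable the profile is in population A than in B."""
    return profile_probability(freqs_pop_a) / profile_probability(freqs_pop_b)


# Hypothetical genotype frequencies for one imputed profile at three loci,
# looked up in two reference populations (illustrative values only).
freqs_european = [0.5, 0.25, 0.1]
freqs_african = [0.25, 0.125, 0.05]

p_european = profile_probability(freqs_european)
one_in = 1 / p_european  # expresses the probability as "1 in N"
lr = likelihood_ratio(freqs_european, freqs_african)

print(f"profile probability: {p_european:.4f} (about 1 in {one_in:.0f})")
print(f"likelihood ratio (European vs African): {lr:.1f}")
```

Each additional independent locus multiplies the profile probability by a further factor below one, which is why discrimination improves as more genetically variant peptides are detected in a sample.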
