Analyses of a set of 128 ancestry informative single-nucleotide polymorphisms in a global set of 119 population samples

Background: Using DNA to determine an individual's ancestry from among human populations is generally interesting and useful for many purposes, including admixture mapping, controlling for population structure in disease or trait association studies and forensic ancestry inference. However, to estimate ancestry, including possible admixture within an individual, as well as heterogeneity within a group of individuals, allele frequencies are necessary for what are believed to be the contributing populations. For this purpose, panels of ancestry informative markers (AIMs) have been developed.

Results: We are presenting our work on one such panel, composed of 128 ancestry informative single-nucleotide polymorphisms (AISNPs) already proposed in the literature. Compared to previous studies of these AISNPs, we have studied three times the number of individuals (4,871) in three times as many population samples (119). We have validated this panel for many ancestry assignment and admixture studies, especially those that were the rationale for the original selection of the 128 SNPs: African Americans and Mexican Americans. At the same time, the limitations of the panel for distinguishing ancestry and quantifying admixture among Eurasian populations are noted.

Conclusion: We demonstrate the simultaneous importance of the specific set of population samples and their relative sample sizes in the use of the structure program to determine which groups cluster together and consequently influence the ability of a marker panel to infer ancestry. We demonstrate the strengths and weaknesses of this particular panel of AISNPs in a global context.
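As an illustration of why population allele frequencies are a prerequisite for ancestry estimation, the following minimal sketch scores one individual's SNP genotypes against each candidate population and picks the most likely one. It is not the method used in this study (which relies on the structure program). It assumes unlinked biallelic SNPs, Hardy-Weinberg proportions within each population, and no admixture. The population names, the frequencies, and the function names are hypothetical and chosen only for the example.

```python
import math

def genotype_log_likelihood(genotypes, freqs, eps=1e-3):
    """Log-likelihood of a multilocus genotype in one population.

    genotypes: per-SNP count (0, 1 or 2) of the reference allele.
    freqs: per-SNP reference-allele frequency in that population.
    Assumes Hardy-Weinberg proportions and independent loci.
    """
    total = 0.0
    for g, p in zip(genotypes, freqs):
        # Clamp frequencies so a fixed allele never yields log(0).
        p = min(max(p, eps), 1.0 - eps)
        q = 1.0 - p
        if g == 2:
            prob = p * p
        elif g == 1:
            prob = 2.0 * p * q
        else:
            prob = q * q
        total += math.log(prob)
    return total

def assign_population(genotypes, pop_freqs):
    """Return the best-scoring population and all log-likelihoods."""
    scores = {
        pop: genotype_log_likelihood(genotypes, freqs)
        for pop, freqs in pop_freqs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical reference-allele frequencies at three SNPs (not real data).
POP_FREQS = {
    "PopA": [0.9, 0.8, 0.1],
    "PopB": [0.2, 0.3, 0.7],
}

best, scores = assign_population([2, 2, 0], POP_FREQS)
```

Because each score depends entirely on the reference frequencies supplied, the choice of population samples directly shapes the assignment, which is the dependence the conclusion highlights for clustering analyses.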
