AncestrySNPminer: a bioinformatics tool to retrieve and develop ancestry informative SNP panels.

A wealth of genomic information is available in public and private databases. However, this information is underutilized for uncovering population-specific and functionally relevant markers underlying complex human traits. Given the large volume of SNP data available from the annotation of human genetic variation, data mining is a fast and cost-effective approach for investigating how many SNPs are informative for ancestry. In this study, we present AncestrySNPminer, the first web-based bioinformatics tool specifically designed to retrieve Ancestry Informative Markers (AIMs) from genomic data sets and link these markers to genes and ontological annotation classes. The tool includes a simple, automated "scripting at the click of a button" functionality that lets researchers apply various population genomics statistical analysis methods, with user-friendly querying and filtering of data sets across populations, through a single web interface. AncestrySNPminer can be freely accessed at https://research.cchmc.org/mershalab/AncestrySNPminer/login.php.
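To make the idea of an ancestry informative marker concrete, the following minimal sketch ranks SNPs by the absolute allele frequency difference (delta) between two populations, one commonly used measure of marker informativeness. It is an illustration only and is not taken from AncestrySNPminer: the function names, the 0.5 threshold, and the SNP identifiers and frequencies are all hypothetical assumptions, and the tool itself may use different or additional statistics.

```python
from typing import Dict, List, Tuple


def allele_frequency_delta(freq_pop_a: float, freq_pop_b: float) -> float:
    """Absolute difference in the frequency of one allele between two populations."""
    for freq in (freq_pop_a, freq_pop_b):
        if not 0.0 <= freq <= 1.0:
            raise ValueError(f"allele frequency out of range [0, 1]: {freq}")
    return abs(freq_pop_a - freq_pop_b)


def select_candidate_aims(
    snp_freqs: Dict[str, Tuple[float, float]],
    threshold: float = 0.5,  # hypothetical cutoff, chosen only for illustration
) -> List[Tuple[str, float]]:
    """Return (snp_id, delta) pairs with delta >= threshold, largest delta first."""
    scored = [
        (snp_id, allele_frequency_delta(freq_a, freq_b))
        for snp_id, (freq_a, freq_b) in snp_freqs.items()
    ]
    kept = [(snp_id, delta) for snp_id, delta in scored if delta >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up allele frequencies in two populations (not real data).
example_freqs = {
    "snp_1": (0.90, 0.10),
    "snp_2": (0.55, 0.45),
    "snp_3": (0.20, 0.80),
}

candidates = select_candidate_aims(example_freqs, threshold=0.5)
for snp_id, delta in candidates:
    print(f"{snp_id}\tdelta={delta:.2f}")
```

In this toy input, only the SNPs whose frequencies differ strongly between the two populations pass the filter; a marker with similar frequencies in both populations carries little ancestry information under this measure.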
