Accuracy of Four Heuristics for the Full Sibship Reconstruction Problem in the Presence of Genotype Errors

The full sibship reconstruction (FSR) problem is the problem of inferring all groups of full siblings from a population sample using genetic marker data, without parental information. The FSR problem remains a significant challenge for computational biology because no exact solution to it has been found. A new algorithm, the SIMPSON-assisted Descending Ratio (SDR) algorithm, is devised by combining a new Simpson-index-based O(n^2) algorithm (MS2) with the existing Descending Ratio (DR) algorithm. The SDR algorithm outperforms the SIMPSON, MS2, and DR algorithms in accuracy and robustness when tested on a variety of sample family structures. The accuracy error is measured as the percentage of incorrectly assigned individuals. The robustness of the FSR algorithms is assessed by simulating a 2% mutation rate per locus (a 1% rate per allele).
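The two evaluation quantities named above can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the accuracy error is read here as a partition distance (the smallest number of individuals that must be moved to turn the inferred grouping into the true one, as a percentage of the sample), the per-allele errors are treated as independent, and a "mutation" is modelled as swapping an allele for a different one drawn uniformly from a hypothetical allele pool. The function names `accuracy_error`, `per_locus_rate`, and `mutate_genotype` are illustrative only, and the brute-force matching is suitable for small examples rather than real data sets.

```python
import random
from itertools import permutations


def accuracy_error(true_groups, inferred_groups):
    """Percentage of individuals that are incorrectly assigned.

    Assumption: "incorrectly assigned" is interpreted as the partition
    distance, found by trying every one-to-one matching of groups
    (brute force, so only practical for a handful of groups).
    """
    a = [set(g) for g in true_groups]
    b = [set(g) for g in inferred_groups]
    n = sum(len(g) for g in a)
    # Pad the shorter partition with empty groups so both have equal length.
    size = max(len(a), len(b))
    a += [set() for _ in range(size - len(a))]
    b += [set() for _ in range(size - len(b))]
    # Largest number of individuals that any matching of groups agrees on.
    best = max(
        sum(len(a[i] & b[j]) for i, j in enumerate(perm))
        for perm in permutations(range(size))
    )
    return 100.0 * (n - best) / n


def per_locus_rate(per_allele_rate, alleles_per_locus=2):
    """Chance that at least one allele at a locus is altered,
    assuming each allele is altered independently."""
    return 1.0 - (1.0 - per_allele_rate) ** alleles_per_locus


def mutate_genotype(genotype, allele_pool, per_allele_rate, rng):
    """Return a copy of a genotype (list of loci, each a tuple of alleles)
    in which each allele is replaced, with the given probability, by a
    different allele from allele_pool (a hypothetical error model)."""
    mutated = []
    for locus in genotype:
        new_locus = []
        for allele in locus:
            if rng.random() < per_allele_rate:
                choices = [x for x in allele_pool if x != allele]
                allele = rng.choice(choices)
            new_locus.append(allele)
        mutated.append(tuple(new_locus))
    return mutated


if __name__ == "__main__":
    truth = [[1, 2, 3], [4, 5], [6]]
    guess = [[1, 2], [3, 4, 5], [6]]
    # Individual 3 is the only one placed in the wrong group.
    print(accuracy_error(truth, guess))
    # A 1% rate per allele gives roughly 2% per diploid locus.
    print(per_locus_rate(0.01))
    rng = random.Random(0)
    print(mutate_genotype([(1, 2), (3, 3)], [1, 2, 3, 4], 0.01, rng))
```

Under the independence assumption, a 1% rate per allele corresponds to 1 - 0.99^2 = 1.99% per diploid locus, which matches the rounded 2% figure quoted above.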
