The Complexity of Checking Consistency of Pedigree Information and Related Problems

Consistency checking is a fundamental computational problem in genetics. Given a pedigree and information on the genotypes of (some of) the individuals in it, the aim of consistency checking is to determine whether these data are consistent with the classic Mendelian laws of inheritance. This problem arose originally from geneticists' need to filter erroneous information out of their input data, and is well motivated from both a biological and a sociological viewpoint. This paper shows that consistency checking is NP-complete, even when attention is restricted to a single gene with three alleles. Several other results on the computational complexity of problems from genetics that are related to consistency checking are also offered. In particular, it is shown that checking the consistency of pedigrees over two alleles, and of pedigrees without loops, can be done in polynomial time.
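
To make the problem statement concrete, the following is a minimal brute-force sketch of single-gene consistency checking. It is not the paper's construction or any of its polynomial-time algorithms. The function names (`mendelian_ok`, `is_consistent`) and the input encoding (a child-to-parents dictionary and a partial genotype dictionary) are assumptions made for illustration. The search tries every genotype assignment for the untyped individuals, so its running time is exponential in their number.

```python
from itertools import combinations_with_replacement, product


def mendelian_ok(child, father, mother):
    """True if the child can take one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def is_consistent(parents, known, alleles):
    """Brute-force consistency check for a single gene (illustrative only).

    parents: dict child -> (father, mother); founders do not appear as keys.
    known:   dict individual -> genotype, an unordered pair of alleles (partial).
    alleles: iterable of allele labels.
    """
    people = set(parents) | set(known)
    for father, mother in parents.values():
        people.update((father, mother))
    untyped = sorted(people - set(known))

    # Every unordered pair of alleles is a candidate genotype.
    genotypes = list(combinations_with_replacement(sorted(alleles), 2))

    # Try each way of completing the missing genotypes.
    for choice in product(genotypes, repeat=len(untyped)):
        full = dict(known)
        full.update(zip(untyped, choice))
        if all(
            mendelian_ok(full[c], full[f], full[m])
            for c, (f, m) in parents.items()
        ):
            return True
    return False


pedigree = {"child": ("dad", "mom")}

# Both parents typed A/A, child typed A/B: no Mendelian explanation exists.
typed_all = {"dad": ("A", "A"), "mom": ("A", "A"), "child": ("A", "B")}
print(is_consistent(pedigree, typed_all, {"A", "B"}))

# Mother untyped: she may carry B, so the data can be explained.
typed_some = {"dad": ("A", "A"), "child": ("A", "B")}
print(is_consistent(pedigree, typed_some, {"A", "B"}))
```

In this toy pedigree the fully typed data are inconsistent, while leaving the mother untyped makes them consistent, because she can be assigned a genotype containing the allele B.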
