Comparing Pedigree Graphs

Pedigree graphs, or family trees, are typically constructed by an expensive process of examining genealogical records to determine which pairs of individuals are parent and child. New methods to automate this process take as input genetic data from a set of extant individuals and reconstruct ancestral individuals. There is a great need to evaluate the quality of these methods by comparing the estimated pedigree to the true pedigree. In this article, we consider two main pedigree comparison problems. The first is the pedigree isomorphism problem, for which we present a linear-time algorithm for leaf-labeled pedigrees. The second is the pedigree edit distance problem, for which we present (1) several algorithms that are fast and exact in various special cases, and (2) a general, randomized heuristic algorithm. In the negative direction, we first prove that the pedigree isomorphism problem is as hard as the general graph isomorphism problem, and that the sub-pedigree isomorphism problem is NP-hard. We then show that the pedigree edit distance problem is APX-hard in general and NP-hard on leaf-labeled pedigrees. We use simulated pedigrees to compare our edit-distance algorithms to each other as well as to a branch-and-bound algorithm that always finds an optimal solution.
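To make the leaf-labeled isomorphism problem concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm described above. It assumes a pedigree is stored as a hypothetical dictionary mapping each child to a list of its parents, that the extant individuals (the leaves) carry fixed labels, and that sex is ignored. The function names are illustrative only. The check tries every relabeling of the non-leaf individuals and so is only practical for very small pedigrees.

```python
from itertools import permutations


def individuals(pedigree):
    """Collect every individual that appears as a child or as a parent."""
    nodes = set(pedigree)
    for parents in pedigree.values():
        nodes.update(parents)
    return nodes


def parent_child_edges(pedigree):
    """Return the set of (parent, child) edges of a child -> parents mapping."""
    return {(p, c) for c, parents in pedigree.items() for p in parents}


def leaf_labeled_isomorphic(ped_a, ped_b, leaves):
    """Brute-force test: is there a bijection of individuals that fixes the
    labeled leaves and maps the parent-child edges of ped_a onto ped_b?"""
    nodes_a, nodes_b = individuals(ped_a), individuals(ped_b)
    edges_a, edges_b = parent_child_edges(ped_a), parent_child_edges(ped_b)
    if len(nodes_a) != len(nodes_b) or len(edges_a) != len(edges_b):
        return False
    if not (leaves <= nodes_a and leaves <= nodes_b):
        return False
    inner_a = sorted(nodes_a - leaves)
    inner_b = sorted(nodes_b - leaves)
    # Try every assignment of unlabeled (ancestral) individuals.
    for perm in permutations(inner_b):
        mapping = {leaf: leaf for leaf in leaves}
        mapping.update(zip(inner_a, perm))
        if {(mapping[p], mapping[c]) for p, c in edges_a} == edges_b:
            return True
    return False


# Two full siblings with differently named parents: same pedigree shape.
ped_a = {"x": ["f1", "m1"], "y": ["f1", "m1"]}
ped_b = {"x": ["dad", "mom"], "y": ["dad", "mom"]}
# Half siblings: one shared parent only, so the shape differs.
ped_c = {"x": ["dad", "mom"], "y": ["dad", "mom2"]}

same_shape = leaf_labeled_isomorphic(ped_a, ped_b, {"x", "y"})
different_shape = leaf_labeled_isomorphic(ped_a, ped_c, {"x", "y"})
```

In this toy example the names of the ancestors differ between `ped_a` and `ped_b`, but only the leaf labels are required to match, which is why the two full-sibling pedigrees are treated as the same.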
