An algorithm based on graph theory for the assembly of contigs in physical mapping of DNA

An algorithm is described for mapping DNA contigs based on an interval graph (IG) representation. In general terms, the input to the algorithm is a set of binary overlap relations among finite intervals spread along a real line, from which the algorithm generates sets of ordered overlapping fragments spanning that line. The implications of a more general case of the IG, called a probe interval graph (PIG), in which only a subset of the cosmids are used as probes, are also discussed. In the specific case of cosmids hybridizing to regions of a YAC, the algorithm takes cross-hybridization data obtained by using the cosmids as probes and orders the cosmids along the YAC; if gaps exist because the cosmid contigs do not cover the full length of the YAC, repeated application of the algorithm generates sets of ordered overlapping fragments. Both the IG and the PIG can expose problems caused by false overlaps, such as hybridizations due to repetitive elements. The algorithm has been coded in C; CPU time grows essentially linearly with the number of cosmids analyzed. Results are presented for the application of a PIG to cosmid contig assembly along a human chromosome 13-specific YAC. An alignment of 67 cosmids spanning a YAC took 0.28 seconds of CPU time on a Convex 220 computer.
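As an illustration of the overlap-graph view described above, the following minimal sketch treats clones as vertices and pairwise overlaps as edges. It is not the authors' C implementation and does not compute the ordering itself. It only shows two ideas from the abstract: disconnected groups of clones correspond to separate contigs when coverage has gaps, and a graph that fails a chordality test cannot be an interval graph, which hints at a false overlap. The function names (`build_graph`, `contigs`, `is_chordal`) and the clone labels are hypothetical. The chordality test uses maximum cardinality search, a standard technique chosen here for brevity; chordality is a necessary but not sufficient condition for a graph to be an interval graph.

```python
from collections import defaultdict


def build_graph(overlaps):
    """Build an adjacency map from a list of (clone, clone) overlap pairs."""
    adj = defaultdict(set)
    for a, b in overlaps:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def contigs(clones, overlaps):
    """Group clones into contigs: connected components of the overlap graph."""
    adj = build_graph(overlaps)
    seen, groups = set(), []
    for start in clones:
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:  # depth-first traversal of one component
            v = stack.pop()
            group.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        groups.append(sorted(group))
    return groups


def is_chordal(clones, overlaps):
    """Return True if the overlap graph is chordal.

    Every interval graph is chordal, so False means the overlap data
    cannot come from intervals on a line (e.g. a false overlap).
    """
    adj = build_graph(overlaps)
    weight = {v: 0 for v in clones}
    unvisited = set(clones)
    order = []
    # Maximum cardinality search: always visit the vertex with the
    # most already-visited neighbours.
    while unvisited:
        v = max(unvisited, key=lambda u: weight[u])
        unvisited.remove(v)
        order.append(v)
        for w in adj[v]:
            if w in unvisited:
                weight[w] += 1
    position = {v: i for i, v in enumerate(order)}
    # The graph is chordal iff, for every vertex, the neighbours
    # visited before it are pairwise adjacent.
    for v in order:
        earlier = [w for w in adj[v] if position[w] < position[v]]
        for i, a in enumerate(earlier):
            for b in earlier[i + 1:]:
                if b not in adj[a]:
                    return False
    return True


if __name__ == "__main__":
    clones = ["A", "B", "C", "D", "E"]
    overlaps = [("A", "B"), ("B", "C"), ("D", "E")]
    print(contigs(clones, overlaps))     # two groups: a gap separates them
    print(is_chordal(clones, overlaps))  # consistent with an interval layout
```

In this sketch, a four-clone cycle such as A-B, B-C, C-D, D-A with no chord would fail the test, because four intervals on a line cannot overlap in a ring. With real hybridization data, a failure of this kind would flag the clone set for inspection rather than identify which overlap is false.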
