Parallel computation for chromosome reconstruction on a cluster of workstations

Reconstructing a physical map of a chromosome from a genomic library is a central computational problem in genetics. Physical map reconstruction in the presence of errors has high computational complexity, which motivates the use of parallel computing. This paper presents parallelization strategies for a maximum likelihood estimation-based approach to physical map reconstruction. The estimation procedure uses gradient descent search to determine the optimal spacings between probes for a given probe ordering; the optimal probe ordering itself is determined by a stochastic optimization algorithm. A two-tier parallelization strategy is proposed in which the gradient descent search is parallelized at the lower level and the stochastic optimization algorithm is simultaneously parallelized at the higher level. Implementation details and experimental results are presented for a distributed-memory multiprocessor cluster running the Parallel Virtual Machine (PVM) environment.
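The nested structure described above (an outer stochastic search over probe orderings, with an inner gradient descent over spacings for each candidate ordering) can be illustrated with a minimal serial sketch. It is not the paper's implementation. Everything specific in it is an assumption made for illustration: the toy squared-error cost stands in for the negative log-likelihood, Metropolis-style simulated annealing stands in for the unspecified stochastic optimization algorithm, and the names `DIST`, `cost_and_grad`, `fit_spacings`, and `anneal_order` are hypothetical. No parallelism or PVM calls are shown.

```python
import math
import random

# Hypothetical toy data: pairwise distances between 5 probes placed on a line.
TRUE_POS = [0.0, 2.0, 3.0, 7.0, 8.0]
DIST = [[abs(a - b) for b in TRUE_POS] for a in TRUE_POS]


def cost_and_grad(order, spacings, dist):
    """Toy squared-error cost (assumed stand-in for a negative log-likelihood)
    and its gradient with respect to the inter-probe spacings."""
    n = len(order)
    pos = [0.0]  # pos[k] is the coordinate of the k-th probe in the ordering
    for s in spacings:
        pos.append(pos[-1] + s)
    cost = 0.0
    grad = [0.0] * (n - 1)
    for a in range(n):
        for b in range(a + 1, n):
            resid = (pos[b] - pos[a]) - dist[order[a]][order[b]]
            cost += resid * resid
            # Every spacing between slots a and b contributes to this distance.
            for m in range(a, b):
                grad[m] += 2.0 * resid
    return cost, grad


def fit_spacings(order, dist, steps=200, lr=0.03):
    """Lower tier: projected gradient descent over spacings for a fixed ordering.
    The step size is tuned for this tiny example only."""
    spacings = [1.0] * (len(order) - 1)
    for _ in range(steps):
        _, grad = cost_and_grad(order, spacings, dist)
        # Clamp at zero so spacings stay non-negative.
        spacings = [max(0.0, s - lr * g) for s, g in zip(spacings, grad)]
    cost, _ = cost_and_grad(order, spacings, dist)
    return cost, spacings


def anneal_order(dist, iters=300, t0=1.0, cooling=0.98, seed=0):
    """Upper tier: simulated annealing over orderings (an assumed choice of
    stochastic optimizer). Each candidate is scored by fit_spacings."""
    rng = random.Random(seed)
    n = len(dist)
    order = list(range(n))
    rng.shuffle(order)
    cur_cost, cur_sp = fit_spacings(order, dist)
    best = (cur_cost, order[:], cur_sp)
    temp = t0
    for _ in range(iters):
        i, j = rng.sample(range(n), 2)
        cand = order[:]
        cand[i], cand[j] = cand[j], cand[i]  # propose swapping two probes
        cand_cost, cand_sp = fit_spacings(cand, dist)
        delta = cand_cost - cur_cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            order, cur_cost, cur_sp = cand, cand_cost, cand_sp
            if cur_cost < best[0]:
                best = (cur_cost, order[:], cur_sp)
        temp *= cooling
    return best


if __name__ == "__main__":
    best_cost, best_order, best_spacings = anneal_order(DIST)
    print(best_cost, best_order, best_spacings)
```

In this sketch the two levels offer different opportunities for parallel work: the pairwise terms summed inside `cost_and_grad` are independent of one another, and separate annealing runs or candidate evaluations in `anneal_order` do not share state. How the paper actually partitions either level across PVM tasks is not stated in the abstract.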
