Determining molecular conformation from distance or density data

The determination of molecular structures is of growing importance in modern chemistry and biology. This thesis presents two practical, systematic algorithms for two structure determination problems. Both algorithms are branch-and-bound techniques adapted to their respective domains. The first problem is the determination of the structures of multimers given rigid monomer structures and (potentially ambiguous) intermolecular distance measurements. In other words, we need to find the transformations that produce the packing interfaces. A substantial difficulty results from ambiguities in assigning intermolecular distance measurements (from NMR, for example) to particular intermolecular interfaces in the structure. We present a rapid and efficient method that solves the packing and the assignment problems simultaneously. The algorithm, AmbiPack, uses a hierarchical division of the search space and branch-and-bound search to eliminate infeasible regions of the space and focus on the remaining space. The algorithm is guaranteed to find all solutions to a pre-determined resolution. The second problem is building a protein model from the initial three-dimensional electron density distribution (density map) obtained from X-ray crystallography. This problem is computationally challenging because proteins are extremely flexible. Our algorithm, ConfMatch, solves this “map interpretation” problem by matching a detailed conformation of the molecule to the density map (conformational matching). The “best match” structure is defined as the one that maximizes the sum of the density at atom positions. The most important idea of ConfMatch is an efficient method for computing accurate bounds for branch-and-bound search. ConfMatch relaxes the conformational matching problem, which is NP-hard and therefore has no known polynomial-time solution, into one that can be solved in polynomial time.
The solution to the relaxed problem is a guaranteed upper bound for the conformational matching problem. In most empirical cases, these bounds are accurate enough to prune the search space dramatically, enabling ConfMatch to solve structures with more than 100 free dihedral angles. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
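The role of a relaxation bound in branch-and-bound search can be illustrated with a small toy problem. The sketch below is not ConfMatch or AmbiPack: the function name `best_assignment`, the `scores` table, and the `compatible` predicate are hypothetical, and the relaxation used here (simply ignoring the compatibility constraint) is a stand-in chosen for brevity, since the abstract does not describe the actual relaxation. It only shows how an upper bound that is cheap to compute lets the search discard branches that cannot beat the best solution found so far.

```python
def best_assignment(scores, compatible):
    """Toy branch-and-bound maximization (illustrative only).

    scores[i][k]      -- hypothetical score of option k at position i
    compatible(i,a,b) -- True if option a at position i-1 may be
                         followed by option b at position i
    Returns (best_score, best_choice, nodes_visited).
    """
    n = len(scores)

    # Relaxed problem: drop the compatibility constraint. Its optimum from
    # position i onward is the sum of per-position maxima, which is an
    # upper bound on any feasible completion and takes linear time.
    suffix_bound = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_bound[i] = suffix_bound[i + 1] + max(scores[i])

    best = {"score": float("-inf"), "choice": None, "visited": 0}

    def search(i, partial, total):
        best["visited"] += 1
        # Prune: even the relaxed optimum cannot beat the incumbent.
        if total + suffix_bound[i] <= best["score"]:
            return
        if i == n:
            best["score"], best["choice"] = total, list(partial)
            return
        # Try high-scoring options first so a good incumbent appears early.
        order = sorted(range(len(scores[i])), key=lambda k: -scores[i][k])
        for k in order:
            if i > 0 and not compatible(i, partial[-1], k):
                continue
            partial.append(k)
            search(i + 1, partial, total + scores[i][k])
            partial.pop()

    search(0, [], 0.0)
    return best["score"], best["choice"], best["visited"]


if __name__ == "__main__":
    # Three positions, two options each; neighbouring options must differ.
    toy_scores = [[3, 1], [5, 2], [4, 1]]
    print(best_assignment(toy_scores, lambda i, a, b: a != b))
```

In this toy, the tighter the relaxed bound is to the true optimum, the earlier branches are cut, which mirrors the abstract's point that accurate bounds are what make the search tractable.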