TEXTAL™: Automated Crystallographic Protein Structure Determination

This paper reports on TEXTAL™, a deployed application that uses a variety of AI techniques to automate the determination of the 3D structure of proteins by X-ray crystallography. The TEXTAL™ project was initiated in 1998, and the application is currently deployed in three ways: (1) as a web-based interface called WebTex, operational since June 2002; (2) as the automated model-building component of an integrated crystallography software package called PHENIX, first released in July 2003; and (3) as binary distributions, available since September 2004. TEXTAL™ and its sub-components are currently used by crystallographers around the world, in both industry and academia. TEXTAL™ can save up to weeks of the effort typically required to determine the structure of a single protein; the system has proven particularly helpful when the quality of the data is poor, which is very often the case. Automated protein modeling systems like TEXTAL™ are critical to the structural genomics initiative, a worldwide effort to determine the 3D structures of all proteins in a high-throughput mode and thereby keep up with the rapid growth of genomic sequence databases.
