Automating tasks in protein structure determination with the Clipper Python module

Scripting languages provide the fastest means of prototyping complex functionality. Those with a syntax and grammar resembling human language also greatly enhance the maintainability of the resulting source code. Furthermore, combining a powerful, machine‐independent scripting language with binary libraries tailored to each computer architecture allows programs to overcome the performance limits traditionally associated with scripts. In the present work, we describe how an efficient C++ crystallographic library such as Clipper can be wrapped, adapted and generalized for use in both crystallographic and electron cryo‐microscopy applications, scripted with the Python language. We also emphasize best practices in automation, illustrating how they can be achieved with this new Python module.
