Paper information - Classification of crystallization outcomes using deep convolutional neural networks

The Machine Recognition of Crystallization Outcomes (MARCO) initiative has assembled roughly half a million annotated images of macromolecular crystallization experiments from various sources and setups. Here, state-of-the-art machine learning algorithms are trained and tested on different parts of this data set. We find that more than 94% of the test images can be correctly labeled, irrespective of their experimental origin. Because crystal recognition is key to high-density screening and the systematic analysis of crystallization experiments, this approach opens the door to both industrial and fundamental research applications.
