Deep data analytics for genetic engineering of diatoms linking genotype to phenotype via machine learning

Genome engineering for materials synthesis is a promising avenue for manufacturing materials with unique properties under ambient conditions. Biomineralization in diatoms, unicellular algae that use silica to construct micron-scale cell walls with nanoscale features, is an attractive candidate for functional synthesis of materials for applications including photonics, sensing, filtration, and drug delivery. Controllably modifying diatom structure through targeted genetic modifications is therefore a promising route to these applications. In this work, we used gene knockdown in Thalassiosira pseudonana diatoms to create modified strains with changes to structural morphology and linked genotype to phenotype using supervised machine learning. An artificial neural network (NN) was developed to distinguish wild-type and modified diatoms based on SEM images of frustules exhibiting phenotypic changes caused by a specific protein (Thaps3_21880), resulting in 94% detection accuracy. Class activation maps visualized the physical changes that allowed the NNs to separate the diatom strains, subsequently identifying a specific gene that controls pores. A further NN was created to batch process image data, automatically recognize pores, and extract pore-related parameters. Class interrelationships of the extracted parameters were visualized using a multivariate data visualization tool called CrossVis, which allowed us to directly link changes in the morphological phenotype of pore size and distribution with changes in the genotype.
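
To make the pore-parameter extraction step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary pore mask is already available (for example, the thresholded output of a segmentation network) and uses a plain connected-component search to measure each pore. The function name `extract_pores`, the 4-connectivity choice, and the reported parameters (pixel area, centroid, equivalent circular diameter) are illustrative assumptions.

```python
from collections import deque
from math import pi, sqrt


def extract_pores(mask):
    """Label 4-connected foreground regions of a binary mask (list of rows)
    and return one dict of parameters per region, in raster-scan order."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    pores = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects every pixel of this pore.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = len(pixels)
            pores.append({
                "area": area,  # in pixels; scale by pixel size for nm^2
                "centroid": (sum(p[0] for p in pixels) / area,
                             sum(p[1] for p in pixels) / area),
                # Diameter of a circle with the same area as the pore.
                "equiv_diameter": 2 * sqrt(area / pi),
            })
    return pores


# Toy mask with two pores (1 = pore pixel); hypothetical data.
toy_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
for pore in extract_pores(toy_mask):
    print(pore)
```

Per-pore records of this kind, collected over many images and tagged with the strain label, are the sort of multivariate table that a parallel-coordinates tool could then display.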
