Prediction of GFP spectral properties using artificial neural network

The prediction of the excitation and emission maxima of green fluorescent protein (GFP) chromophores was investigated in a quantitative structure‐property relationship study. A data set of 19 GFP color variants and an additional data set of 29 synthetic GFP chromophores were collected from the literature. An artificial neural network implementing the back‐propagation algorithm was employed. The proposed computational approach reliably predicted the excitation and emission maxima of GFP chromophores, with correlation coefficients exceeding 0.9. A comparative study with other molecular descriptors revealed the usefulness of quantum chemical descriptors. Assigning the appropriate protonation state to the chromophore in the GFP color variants data set was shown to be necessary for good predictive performance. The results suggest that confinement of the GFP chromophore has no significant influence on predictive performance for the data set used. A comparative investigation with traditional modeling methods, particularly multiple linear regression and partial least squares, indicates that the artificial neural network is the most suitable modeling approach for GFP spectral properties. This methodology is anticipated to have great potential for accelerating the design and engineering of novel GFP color variants of scientific or industrial interest. © 2007 Wiley Periodicals, Inc. J Comput Chem, 2007
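The modeling workflow summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptors and "emission maxima" below are synthetic random numbers, the network size, learning rate, and epoch count are arbitrary, and leave-one-out validation is assumed here only as a plausible way to assess a data set of 19 samples. The names `train_mlp`, `predict`, and `loo_predictions` are hypothetical helpers introduced for this example.

```python
import numpy as np

def train_mlp(X, y, hidden=4, lr=0.05, epochs=2000, seed=0):
    """Fit a one-hidden-layer network by batch gradient descent (back-propagation)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)               # forward pass: hidden activations
        err = h @ W2 + b2 - y                  # residuals of the linear output unit
        # Backward pass: gradients of the loss sum(err**2) / (2 * n).
        gW2 = h.T @ err / n
        gb2 = err.mean()
        dh = np.outer(err, W2) * (1.0 - h**2)  # error propagated through tanh
        gW1 = X.T @ dh / n
        gb1 = dh.mean(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

def loo_predictions(X, y, **kwargs):
    """Predict each sample from a network trained on all the other samples."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        # Standardize using statistics of the training fold only.
        mu, sd = X[mask].mean(axis=0), X[mask].std(axis=0)
        y_mu, y_sd = y[mask].mean(), y[mask].std()
        params = train_mlp((X[mask] - mu) / sd, (y[mask] - y_mu) / y_sd, **kwargs)
        preds[i] = predict(params, (X[i:i + 1] - mu) / sd)[0] * y_sd + y_mu
    return preds

# Synthetic stand-in data (assumption): 19 "variants", 3 made-up descriptors,
# and a made-up nonlinear relationship to a wavelength-like target in nm.
rng = np.random.default_rng(42)
X = rng.normal(size=(19, 3))
y = 480.0 + 30.0 * np.tanh(X[:, 0]) + 10.0 * X[:, 1] + rng.normal(0.0, 2.0, 19)

preds = loo_predictions(X, y)
r = np.corrcoef(y, preds)[0, 1]                # correlation of observed vs. predicted
print(f"leave-one-out correlation coefficient: {r:.3f}")

# Training error on the fully standardized data, as a sanity check of the fit.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ys = (y - y.mean()) / y.std()
train_mse = np.mean((predict(train_mlp(Xs, ys), Xs) - ys) ** 2)
```

The correlation coefficient between observed and predicted values is the same kind of figure of merit quoted in the abstract. The number printed here reflects only the synthetic data and says nothing about the published results.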
