ANN Multiscale Model of Anti-HIV Drugs Activity vs AIDS Prevalence in the US at County Level Based on Information Indices of Molecular Graphs and Social Networks

This work describes the workflow of a methodology that combines chemoinformatics and pharmacoepidemiology methods, and reports the first predictive model developed with it. The model predicts complex networks of AIDS prevalence in US counties, taking into account both social determinants and the structure and activity of anti-HIV drugs in preclinical assays. We trained different Artificial Neural Networks (ANNs) using information indices of social networks and molecular graphs as inputs. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILES codes) from the ChEMBL database. Last, we used Box-Jenkins moving average operators to quantify how far each drug deviates from reference data subsets (targets, organisms, experimental parameters, protocols). The best model found was a Linear Neural Network (LNN) with values of Accuracy, Specificity, and Sensitivity above 0.76 and AUROC > 0.80 in the training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the activity of anti-HIV drugs in preclinical assays. To train and validate the model and to predict the complex network, we analyzed 43,249 data points, including values of AIDS prevalence in 2,310 US counties vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures.
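The two kinds of input descriptors mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the Shannon-type index takes the form -p·log(p) applied to a probability-like value such as the Gini coefficient, and that a Box-Jenkins moving average operator is the deviation of an index from its mean over all cases sharing the same condition (for example, the same target). The function names, record fields, and numeric values are hypothetical.

```python
import math
from collections import defaultdict


def shannon_term(p):
    """Shannon-type term -p*log(p) for a probability-like value p in (0, 1].

    Assumption: the abstract does not give the exact formula; this is the
    generic single-term Shannon form, applied here to a Gini coefficient.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -p * math.log(p)


def moving_average_deviations(records, index_key, condition_key):
    """Deviation of each record's index from the mean of its reference subset.

    The reference subset is every record that shares the same value of
    condition_key (e.g. the same target, organism, or protocol).
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[condition_key]].append(record[index_key])
    means = {cond: sum(vals) / len(vals) for cond, vals in groups.items()}
    return [r[index_key] - means[r[condition_key]] for r in records]


# Hypothetical toy data: three drugs, two targets, made-up index values.
records = [
    {"drug": "A", "target": "RT", "info_index": 2.0},
    {"drug": "B", "target": "RT", "info_index": 4.0},
    {"drug": "C", "target": "protease", "info_index": 5.0},
]

deviations = moving_average_deviations(records, "info_index", "target")
gini_information = shannon_term(0.5)  # illustrative Gini value, not real data
```

In this sketch, each deviation measures how atypical a drug's index is within its own reference subset, so the same drug can produce different input values under different targets or protocols.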
