How Consistent are Publicly Reported Cytotoxicity Data? Large‐Scale Statistical Analysis of the Concordance of Public Independent Cytotoxicity Measurements

While increasing attention is being paid to the impact of data quality on cell‐line sensitivity and toxicology modeling, no systematic study to date has evaluated the comparability of independent cytotoxicity measurements on a large scale. Here, we estimate the experimental uncertainty of public cytotoxicity data from ChEMBL version 19. We applied stringent filtering criteria to assemble a curated data set comprising pIC50 data for compound–cell line systems measured in independent laboratories. The estimated experimental uncertainty corresponded to a mean unsigned error (MUE) of 0.61–0.76, a median unsigned error (MedUE) of 0.51–0.58, and a standard deviation of 0.76–1.00 pIC50 units. The experimental uncertainty (σE) estimated from all pairs of cytotoxicity measurements with a ΔpIC50 value lower than 2.5 was 0.59–0.77 pIC50 units, and thus 21–60 % and 21–26 % higher than that of pKi and pIC50 data for ligand–protein systems (σE=0.47–0.48 pKi units and σE=0.57–0.61 pIC50 units, respectively). The σE value estimated from the pairs of pIC50 values measured with metabolic assays was 0.98, whereas it was 0.69 for the 1388 pIC50 pairs measured using exactly the same experimental setup. The maximum achievable Pearson correlation coefficient ($R^{2}_{\mathrm{Pearson,\,max}}$) of in silico models trained on cytotoxicity data from different laboratories was estimated to be 0.51–0.85, which is considerably lower than the value of 1 corresponding to perfect predictions and indicates the upper bound on the performance that can be expected from computational cytotoxicity predictions. The lowest concordance between pairs of measurements was found for the drugs paclitaxel, methotrexate, zidovudine, and docetaxel, and for the cell lines HepG2, NCI‐H460, L1210, and CCRF‐CEM, suggesting that these systems are particularly sensitive to the experimental setup.
The highest concordance was estimated for the compound–cell line system HL‐60–etoposide (σE=0.70), whereas the lowest was estimated for L1210–methotrexate (σE=1.68). We found that annotation errors are responsible for the high discordance observed for some pairs of measurements, which underscores the importance of data curation when automatically extracting cytotoxicity data from public databases. Likewise, these results highlight the importance of estimating compound cytotoxicity with assays that provide complementary biological information (i.e., metabolic assays, clonogenic assays, and assays based on cell membrane integrity), especially when the mechanism of action of the test compounds is unknown. This analysis can inform guidelines on the reliability of cytotoxicity data from public databases, which could ultimately prove valuable for modeling purposes and for guiding the reporting of data in the literature.
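As an illustration of the pairwise statistics named above, the following minimal Python sketch computes the MUE, MedUE, and standard deviation of ΔpIC50 for a set of measurement pairs, with an optional ΔpIC50 cutoff such as 2.5. It is a sketch under stated assumptions, not necessarily the exact procedure used in the study: the function name `pair_statistics` and the example values are hypothetical, the spread of ΔpIC50 is taken about zero because the order within a pair is arbitrary, and σE is obtained as that spread divided by √2, which holds only if both measurements carry independent errors of equal variance.

```python
import math
import statistics


def pair_statistics(pairs, max_delta=None):
    """Summarize the agreement between paired pIC50 measurements.

    pairs:     iterable of (pIC50_a, pIC50_b) for the same compound-cell line system
    max_delta: optional cutoff; pairs with |delta pIC50| >= max_delta are discarded
    """
    deltas = [a - b for a, b in pairs]
    if max_delta is not None:
        deltas = [d for d in deltas if abs(d) < max_delta]
    if not deltas:
        raise ValueError("no pairs left after filtering")

    abs_deltas = [abs(d) for d in deltas]

    # Assumption: the sign of each difference is arbitrary, so the spread
    # is measured about zero (root mean square of the differences).
    sd_delta = math.sqrt(sum(d * d for d in deltas) / len(deltas))

    # Assumption: independent errors of equal variance in both measurements,
    # so Var(delta) = 2 * sigma_E**2.
    sigma_e = sd_delta / math.sqrt(2)

    return {
        "n_pairs": len(deltas),
        "mue": statistics.mean(abs_deltas),
        "medue": statistics.median(abs_deltas),
        "sd_delta": sd_delta,
        "sigma_e": sigma_e,
    }


# Hypothetical pIC50 pairs, for illustration only (not data from the study).
example_pairs = [(6.0, 5.0), (7.0, 7.5), (5.0, 5.0), (8.0, 5.0)]
summary = pair_statistics(example_pairs, max_delta=2.5)
print(summary)
```

With the cutoff applied, the last hypothetical pair (ΔpIC50 = 3.0) is excluded before the statistics are computed, mirroring the restriction to pairs with a ΔpIC50 value lower than 2.5.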
