Computational selection of antibody-drug conjugate targets for breast cancer

The selection of therapeutic targets is a critical aspect of antibody-drug conjugate research and development. In this study, we applied computational methods to select candidate targets overexpressed in three major breast cancer subtypes as compared with a range of vital organs and tissues. Microarray data corresponding to over 8,000 tissue samples were collected from the public domain. Breast cancer samples were classified into molecular subtypes using an iterative ensemble approach combining six classification algorithms and three feature selection techniques, including a novel kernel density-based method. This feature selection method was used in conjunction with differential expression and subcellular localization information to assemble a primary list of targets. A total of 50 cell membrane targets were identified, including one target for which an antibody-drug conjugate is in clinical use, and six targets for which antibody-drug conjugates are in clinical trials for the treatment of breast cancer and other solid tumors. In addition, 50 extracellular proteins were identified as potential targets for non-internalizing strategies and alternative modalities. Candidate targets linked with the epithelial-to-mesenchymal transition were identified by analyzing differential gene expression in epithelial and mesenchymal tumor-derived cell lines. Overall, these results show that mining human gene expression data has the power to select and prioritize breast cancer antibody-drug conjugate targets, and the potential to lead to new and more effective cancer therapeutics.
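The abstract does not describe how the kernel density-based feature selection works, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes each gene is scored by how little the kernel density estimate of its expression in tumor samples overlaps with the estimate in normal tissue samples, and that only genes with higher median expression in tumors are of interest. The function names, the bandwidth rule, the gene labels, and the toy expression values are all hypothetical.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return (density function, bandwidth) for a 1-D Gaussian kernel density estimate."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not from the source);
        # fall back to a small constant when all values are identical.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-0.2) or 1e-3
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density, bandwidth


def overlap_coefficient(a, b, grid_points=512):
    """Approximate the shared area under the two density estimates (0 = disjoint, 1 = identical)."""
    fa, ha = gaussian_kde(a)
    fb, hb = gaussian_kde(b)
    pad = 4.0 * max(ha, hb)  # extend the grid so the kernel tails are covered
    lo = min(min(a), min(b)) - pad
    hi = max(max(a), max(b)) + pad
    step = (hi - lo) / (grid_points - 1)
    ys = [min(fa(lo + i * step), fb(lo + i * step)) for i in range(grid_points)]
    # Trapezoidal rule over the pointwise minimum of the two densities.
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def rank_candidates(expression):
    """Rank genes by 1 - overlap, keeping a nonzero score only for tumor-overexpressed genes.

    `expression` maps a gene label to a (tumor_values, normal_values) pair.
    """
    scores = []
    for gene, (tumor, normal) in expression.items():
        overexpressed = statistics.median(tumor) > statistics.median(normal)
        score = 1.0 - overlap_coefficient(tumor, normal) if overexpressed else 0.0
        scores.append((gene, score))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy log-scale expression values, invented for illustration only.
tumor = [8.1, 8.4, 7.9, 8.8, 8.3]
normal = [3.0, 3.2, 2.9, 3.5, 3.1]
ranked = rank_candidates({
    "GENE_A": (tumor, normal),   # clearly separated, higher in tumor
    "GENE_B": (tumor, tumor),    # identical distributions
    "GENE_C": (normal, tumor),   # separated, but lower in tumor
})
```

In a pipeline like the one summarized above, a score of this kind would presumably be combined with differential expression statistics and subcellular localization annotations before a gene is shortlisted; that combination is not shown here.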
