Bayesian multiple instance regression for modeling immunogenic neoantigens

The relationship between tumor immune responses and tumor neoantigens remains one of the most fundamental unsolved questions in tumor immunology, and it is key to understanding the limited efficacy of immunotherapy observed in many cancer patients. In particular, the properties that allow a neoantigen to elicit an immune response remain unclear. This biological problem can be formulated and solved within a multiple instance learning framework, which models the multiple instances (neoantigens) within each bag (patient specimen) together with the continuous response (T cell infiltration) observed for each bag. To this end, we develop a Bayesian multiple instance regression method, named BMIR, which uses a Gaussian distribution to model the continuous responses and latent binary variables to indicate the primary instances in each bag. Through this Bayesian modeling, BMIR can learn a function for predicting bag-level responses, identify the primary instances within bags, and provide access to Bayesian statistical inference, capabilities that are not readily available in existing work. We demonstrate the superiority of BMIR over previously proposed optimization-based methods for multiple instance regression through simulation and real data analyses. Our method is implemented in the R package “BayesianMIR”, which is available at https://github.com/inmybrain/BayesianMIR.
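To make the modeling idea concrete, the following is a minimal Python sketch of one way a Gaussian response and latent primary-instance indicators can fit together. It is an illustration under stated assumptions, not the BayesianMIR package or its API: it assumes exactly one primary instance per bag, a linear mean function, a uniform prior over which instance is primary, and known coefficients and noise level when scoring instances. The names `simulate_bags` and `primary_posterior` are hypothetical.

```python
import numpy as np

def simulate_bags(n_bags=200, n_inst=5, sigma=0.1, seed=0):
    """Simulate bags whose response depends on one latent primary instance.

    Assumption (not taken from the abstract): each bag has exactly one
    primary instance and the bag response is linear in its features.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -2.0, 0.5])            # illustrative coefficients
    X = rng.normal(size=(n_bags, n_inst, beta.size))
    primary = rng.integers(0, n_inst, size=n_bags)  # latent indicator per bag
    x_primary = X[np.arange(n_bags), primary]       # features of primary instances
    y = x_primary @ beta + rng.normal(scale=sigma, size=n_bags)
    return X, y, primary, beta

def primary_posterior(X_bag, y_bag, beta, sigma):
    """Posterior probability that each instance in a bag is the primary one.

    Uses a Gaussian likelihood and a uniform prior over instances, with
    beta and sigma treated as known (an assumption made for brevity).
    """
    resid = y_bag - X_bag @ beta
    log_lik = -0.5 * (resid / sigma) ** 2
    log_lik -= log_lik.max()                      # stabilise the exponentials
    weights = np.exp(log_lik)
    return weights / weights.sum()

if __name__ == "__main__":
    X, y, primary, beta = simulate_bags()
    guesses = np.array([
        primary_posterior(X[i], y[i], beta, 0.1).argmax() for i in range(len(y))
    ])
    print("fraction of primary instances recovered:", (guesses == primary).mean())
```

In a full Bayesian treatment the coefficients and noise variance would themselves be given priors and updated alongside the indicators, for example within a Markov chain Monte Carlo scheme; the sketch only shows the conditional scoring of instances within a bag.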
