Tensor clustering with algebraic constraints gives interpretable groups of crosstalk mechanisms in breast cancer

We introduce a tensor-based clustering method to extract sparse, low-dimensional structure from high-dimensional, multi-indexed datasets. The framework detects clusters in the presence of structural requirements, which we encode as algebraic constraints in a linear program. The method is general and can be tailored to a variety of applications in science and industry. We illustrate it on a collection of experiments measuring the response of genetically diverse breast cancer cell lines to an array of ligands. Each experiment corresponds to one cell line–ligand combination and contains time-course measurements of the early signalling kinases MAPK and AKT at two ligand dose levels. By imposing appropriate structural constraints and respecting the multi-indexed structure of the data, the clustering can be optimized for biological interpretation and therapeutic understanding. We then perform a systematic, large-scale exploration of mechanistic models of MAPK–AKT crosstalk for each cluster. This analysis allows us to quantify the heterogeneity of breast cancer cell subtypes and leads to hypotheses about the signalling mechanisms that mediate the response of the cell lines to ligands.
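
To make the idea of encoding structural requirements as constraints in a linear program concrete, the following is a generic sketch of a constrained cluster-assignment problem. It is not the formulation used in this work. The symbols are assumptions introduced for illustration only: $x_{c\ell k}$ is a binary variable assigning the experiment for cell line $c$ and ligand $\ell$ to cluster $k$, $d_{c\ell k}$ is a dissimilarity between that experiment and cluster $k$, and $m$ is a minimum cluster size.

```latex
\begin{align}
  \min_{x} \quad & \sum_{c,\ell} \sum_{k=1}^{K} d_{c\ell k}\, x_{c\ell k} \\
  \text{s.t.} \quad & \sum_{k=1}^{K} x_{c\ell k} = 1
      && \text{for every experiment } (c,\ell), \\
  & \sum_{c,\ell} x_{c\ell k} \ge m
      && \text{for every cluster } k, \\
  & x_{c\ell k} \in \{0,1\}.
\end{align}
```

The first constraint places each experiment in exactly one cluster. The second is one example of a structural requirement, here a hypothetical lower bound on cluster size. Other requirements that are linear in $x$ could be added in the same way, and the indices $(c,\ell)$ are kept separate so that constraints can refer to cell lines and ligands individually.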
