Resolving complex hierarchies in chemical mixtures: how chemometrics may serve in understanding the immune system.

In immunology, resolving complex chemical mixtures, a problem familiar from omics, comes with an added layer of hierarchy: bioactive immunological surface markers are embedded in the membranes of cells such as white blood cells. Each blood sample therefore consists of a complex mixture of cells, which need to be resolved based on their surface marker chemistry to investigate their involvement in an immune response. This mixture may be measured at the single-cell level with Multicolour Flow Cytometry (MFC). Finding such cellular and molecular markers is of the utmost academic and diagnostic importance. Several advanced data analysis methods therefore aim to meet the considerable data challenge of resolving such cell mixtures. These multivariate methods are more resource-efficient than the manual analysis of MFC data, called sequential gating, and are also likely to provide biomedical insight beyond that of this conventional bivariate approach. To compare such methods more comprehensively than has been done until now, we have developed a list of criteria for how well each method recovers information at both the cellular and the underlying molecular level in an MFC sample of an asthma patient. We compare these methods for the chemometric data analysis commonly used in metabolomics. The comparison shows that each method has its own advantages, whether in recovering the sequential gating results, revealing the limitations of sequential gating, providing insight into the chemical relationships between cells within the mixture, or resolving information related to chemical heterogeneities between cells.
We furthermore show how comparative analyses of different samples may lead to further insight into the subdivision of cells into different types based on their immunological involvement in asthma development, and how sparsity, a currently popular approach to enhancing the discriminative ability of multivariate models, may reduce the insight into the underlying hierarchical variability in cell chemistry. Although developed for cytometry, the presented chemometric methods will be highly valuable for many other chemical systems in which the hierarchical arrangement of molecules plays a crucial role.
