Finding Archetypal Spaces Using Neural Networks

Archetypal analysis is a data decomposition method that describes each observation in a dataset as a convex combination of "pure types" or archetypes. These archetypes represent extrema of a data space in which there is a trade-off between features, such as in biology, where different combinations of traits provide optimal fitness for different environments. Existing methods for archetypal analysis work well when a linear relationship exists between the feature space and the archetypal space. However, such methods are not applicable to systems where the feature space is generated non-linearly from the combination of archetypes, such as in biological systems or image transformations. Here, we propose a reformulation of the problem such that the goal is to learn a non-linear transformation of the data into a latent archetypal space. To solve this problem, we introduce the Archetypal Analysis network (AAnet), a deep neural network framework for learning and generating from a latent archetypal representation of data. We demonstrate state-of-the-art recovery of ground-truth archetypes in non-linear data domains, show that AAnet can generate from data geometry rather than from data density, and use AAnet to identify biologically meaningful archetypes in single-cell gene expression data.
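
To make the convex-combination idea concrete, the following is a minimal sketch and not the AAnet implementation. It assumes three hypothetical archetypes in two dimensions, samples mixture weights from a Dirichlet distribution, and recovers them by solving for barycentric coordinates. The function names, the archetype coordinates, and the choice of `np.exp` as the non-linear feature map are all illustrative assumptions. Recovery is exact when the features are linear in the archetypes and is no longer exact once the features are warped non-linearly, which is the setting the abstract targets.

```python
import numpy as np


def sample_convex_combinations(archetypes, n_samples, seed=0):
    """Draw points as convex combinations of the rows of `archetypes`."""
    rng = np.random.default_rng(seed)
    k = archetypes.shape[0]
    # Dirichlet samples are non-negative and each row sums to 1.
    weights = rng.dirichlet(np.ones(k), size=n_samples)
    points = weights @ archetypes
    return points, weights


def barycentric_weights(points, archetypes):
    """Solve for weights w with w @ archetypes = point and sum(w) = 1.

    Assumes k = d + 1 affinely independent archetypes, so the system is square.
    """
    k = archetypes.shape[0]
    n = points.shape[0]
    lhs = np.vstack([archetypes.T, np.ones(k)])   # shape (d + 1, k)
    rhs = np.vstack([points.T, np.ones(n)])       # shape (d + 1, n)
    return np.linalg.solve(lhs, rhs).T            # shape (n, k)


# Hypothetical archetypes: the corners of a triangle in 2-D.
archetypes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
points, weights = sample_convex_combinations(archetypes, n_samples=200)

# Linear case: the true mixture weights are recovered.
recovered_linear = barycentric_weights(points, archetypes)

# Non-linear case (illustrative warp): the same linear solve, applied to the
# warped points and warped archetypes, no longer returns the true weights.
recovered_nonlinear = barycentric_weights(np.exp(points), np.exp(archetypes))

print("max error, linear features:    ", np.abs(recovered_linear - weights).max())
print("max error, non-linear features:", np.abs(recovered_nonlinear - weights).max())
```

The gap between the two errors illustrates why a learned non-linear map into a latent archetypal space is needed when the feature space is not a linear image of the archetypes.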
