A Measurement of In-Betweenness and Inference Based on Shape Theories

We propose a statistical framework for investigating whether a given subpopulation lies between two other subpopulations in a multivariate feature space. The methodology is motivated by a biological question from a collaborator: does a newly discovered cell type lie between two known types with respect to several given features? We propose two in-betweenness indices (IBIs) that quantify the in-betweenness exhibited by a random triangle formed by the summary statistics of the three subpopulations. We provide statistical inference methods for both the triangle shape and the IBI metrics. We demonstrate the methods in three examples: the classic Iris data set, a study of risk of relapse across three breast cancer subtypes, and the motivating neuronal cell data with measured electrophysiological features.
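
To make the triangle idea concrete, the sketch below forms a triangle from the mean vectors of three groups and reports the interior angle at the vertex of the candidate "in-between" group. This is only an illustration under assumptions: the abstract does not define the two IBIs, so the angle used here is a stand-in and not the authors' index, and the function names `group_means` and `angle_at_middle` are hypothetical. An angle near 180 degrees means the middle group's mean lies close to the segment joining the other two means.

```python
import numpy as np


def group_means(a, b, c):
    """Return the mean vector (one triangle vertex) for each of three groups.

    Each argument is an (n_samples, n_features) array-like.
    Using the mean as the summary statistic is an assumption of this sketch.
    """
    return tuple(np.asarray(g, dtype=float).mean(axis=0) for g in (a, b, c))


def angle_at_middle(mu_a, mu_b, mu_c):
    """Interior angle (degrees) at mu_b in the triangle (mu_a, mu_b, mu_c).

    Illustrative stand-in for an in-betweenness measure, NOT the paper's IBI.
    """
    u = mu_a - mu_b
    v = mu_c - mu_b
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against tiny floating-point overshoot outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


if __name__ == "__main__":
    # Synthetic data for illustration only (not from the paper).
    rng = np.random.default_rng(0)
    a = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(50, 2))
    b = rng.normal(loc=[1.0, 0.1], scale=0.1, size=(50, 2))
    c = rng.normal(loc=[2.0, 0.0], scale=0.1, size=(50, 2))
    print(angle_at_middle(*group_means(a, b, c)))
```

A point estimate like this says nothing about sampling variability; the inference methods mentioned in the abstract are what address that question.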
