A MULTIVARIATE MEDIAN IN BANACH SPACES AND APPLICATIONS TO ROBUST PCA

With the rise in prominence of high-dimensional data, multivariate measures of center have become increasingly important. In this paper we focus on one multivariate measure of center, the geometric median, which is defined as the minimizer of the sum of distances to the data points. We study the quantitative robustness of the geometric median, showing that for a non-degenerate distribution of N points, altering k points can change the median by at most O(k/N). Taking advantage of this robustness, we introduce a robust form of Principal Component Analysis (PCA) based on what we call the median covariance matrix. Since there are several natural matrix norms, we study the geometric median in general Banach spaces. We conclude by conjecturing that the geometric median is robust in all uniformly convex Banach spaces.
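
As a small illustration of the definition above, the geometric median of a finite point set can be approximated numerically. The following is a minimal sketch only, under assumptions not stated in the text: it works in finite-dimensional Euclidean space with the 2-norm (not a general Banach space) and uses Weiszfeld's fixed-point iteration, a standard method that is not claimed to be the one used in this paper. The function name `geometric_median` and the parameters `tol` and `max_iter` are hypothetical and chosen for illustration.

```python
import numpy as np


def geometric_median(points, tol=1e-9, max_iter=1000):
    """Approximate the minimizer of sum_i ||x - p_i||_2 (Weiszfeld iteration).

    Assumption: Euclidean norm in R^d; this is an illustrative sketch only.
    """
    points = np.asarray(points, dtype=float)
    x = points.mean(axis=0)  # start from the centroid
    for _ in range(max_iter):
        dists = np.linalg.norm(points - x, axis=1)
        # Guard against division by zero if the iterate lands on a data point.
        weights = 1.0 / np.maximum(dists, 1e-12)
        # Re-weighted average: points closer to x receive larger weight.
        x_new = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

For example, calling `geometric_median` on the four corners of a square together with one far-away point should return a point that stays close to the square, whereas the mean is pulled far toward the outlier. This is the qualitative behavior that the O(k/N) robustness bound quantifies.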