A fast multiscale framework for data in high-dimensions: Measure estimation, anomaly detection, and compressive measurements

Data sets are often modeled as samples from a probability distribution lying in a very high-dimensional space. In practice, they tend to exhibit low intrinsic dimensionality, which enables both the fast construction of efficient data representations and the solution of statistical tasks such as regression of functions on the data, or even estimation of the probability distribution from which the data are generated. In this paper we introduce a novel multiscale density estimator for high-dimensional data and apply it to the problem of detecting changes in the distribution of dynamic data, or in a time series of data sets. We also show that our data representations, which are not standard sparse linear expansions, are amenable to compressed measurements. Finally, we test our algorithms on both synthetic data and a real data set consisting of a time series of hyperspectral images, and demonstrate their high accuracy in the detection of anomalies.