Efficient estimation algorithms for neighborhood variance and other moments

The neighborhood variance problem is as follows. Given a (directed or undirected) graph with a value associated with each node, compute a data structure that, for any given node v and radius r ≥ 0, quickly produces an estimate of the variance of the values of all nodes that lie within distance r of v. The problem can be generalized to other moment functions and to arbitrary distance-dependent decay.

These problems are motivated by applications where the relevance of a measurement observed (or data present) at a certain location decreases with its distance, and thus the aggregate value varies by location. The centralized version of the problem is motivated by applications to query processing on graphical databases. The distributed version of the problem falls in a model we recently introduced for spatially decaying aggregation and is motivated by sensor or p2p networks.

We present novel algorithms for the centralized and distributed versions of the problem. Our algorithms are nearly optimal: the centralized version requires Õ(m) time, and the distributed version requires polylogarithmic communication per node or edge (depending on assumptions).
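To make the queried quantity concrete, the following is a minimal sketch of an exact brute-force baseline, not the estimation algorithms announced above. It assumes an unweighted graph given as an adjacency dictionary, hop-count distance, and the population variance; the names `neighborhood_variance`, `adj`, and `values` are hypothetical and chosen only for illustration. Each query runs a breadth-first search from v, which is the per-query cost that a precomputed data structure is meant to avoid.

```python
from collections import deque


def neighborhood_variance(adj, values, v, r):
    """Exact population variance of the values at nodes within hop distance r of v.

    adj    -- dict mapping each node to an iterable of its out-neighbors (assumed format)
    values -- dict mapping each node to a number
    """
    # Breadth-first search from v, truncated at distance r.
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] + 1 > r:
            continue  # neighbors of u would lie beyond radius r
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)

    # The neighborhood always contains v itself, so xs is never empty.
    xs = [values[u] for u in dist]
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


# Path graph 0 - 1 - 2 - 3 with one value per node.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = {0: 1.0, 1: 2.0, 2: 3.0, 3: 10.0}
print(neighborhood_variance(adj, values, 0, 1))  # nodes {0, 1}: prints 0.25
```

For a directed graph the same sketch applies if `adj` lists out-neighbors, and a weighted variant would replace the breadth-first search with Dijkstra's algorithm; both are assumptions beyond what the text above specifies.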