Paper information - Randomized graph cluster randomization

Randomized graph cluster randomization

The global average treatment effect (GATE) is a primary quantity of interest in the study of causal inference under network interference. With a correctly specified exposure model of the interference, the Horvitz-Thompson (HT) and Hajek estimators of the GATE are unbiased and consistent, respectively, yet known to exhibit extreme variance under many designs and in many settings of interest. With a fixed clustering of the interference graph, graph cluster randomization (GCR) designs have been shown to greatly reduce variance compared to node-level random assignment, but even so the variance is still often prohibitively large. In this work we propose a randomized version of the GCR design, descriptively named randomized graph cluster randomization (RGCR), which uses a random clustering rather than a single fixed clustering. By considering an ensemble of many different cluster assignments, this design avoids a key problem with GCR where a given node is sometimes "lucky" or "unlucky" in a given clustering. We propose two randomized graph decomposition algorithms for use with RGCR, randomized 3-net and 1-hop-max, adapted from prior work on multiway graph cut problems. When integrating over their own randomness, these algorithms furnish network exposure probabilities that can be estimated efficiently. We develop upper bounds on the variance of the HT estimator of the GATE under assumptions on the metric structure of the interference graph. Where the best known variance upper bound for the HT estimator under a GCR design is exponential in the parameters of the metric structure, we give a comparable variance upper bound under RGCR that is instead polynomial in the same parameters. We provide extensive simulations comparing RGCR and GCR designs, observing substantial reductions in the mean squared error for both HT and Hajek estimators of the GATE in a variety of settings.
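The design and estimator described in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the clustering rule, the exposure model, or the assignment mechanism, so all of the following are assumptions made for illustration: 1-hop-max is taken to mean that each node joins the cluster labeled by the node with the largest independent uniform draw in its closed 1-hop neighborhood; exposure is taken to be full-neighborhood exposure (a node and all of its neighbors share the same arm); and clusters are assigned to an arm independently with probability `q`. The function names and the adjacency-dict graph representation are hypothetical, and the Monte Carlo averaging is one plausible way to estimate exposure probabilities over random clusterings, not necessarily the authors' procedure.

```python
import random


def one_hop_max_clustering(adj, rng):
    """Assumed 1-hop-max rule: every node draws an independent uniform
    number and joins the cluster labeled by the node holding the largest
    draw in its closed 1-hop neighborhood (itself plus its neighbors)."""
    draws = {v: rng.random() for v in adj}
    return {v: max([v, *adj[v]], key=draws.__getitem__) for v in adj}


def exposure_prob_given_clustering(adj, clusters, v, q):
    """Probability that v and all of its neighbors land in the same arm,
    when each cluster enters that arm independently with probability q."""
    touched = {clusters[u] for u in [v, *adj[v]]}
    return q ** len(touched)


def estimate_exposure_probs(adj, q=0.5, n_samples=1000, seed=0):
    """Monte Carlo average of the conditional exposure probability over
    freshly sampled clusterings (integrating out the clustering randomness)."""
    rng = random.Random(seed)
    totals = {v: 0.0 for v in adj}
    for _ in range(n_samples):
        clusters = one_hop_max_clustering(adj, rng)
        for v in adj:
            totals[v] += exposure_prob_given_clustering(adj, clusters, v, q)
    return {v: total / n_samples for v, total in totals.items()}


def horvitz_thompson_gate(y, treated_exposed, control_exposed, prob_t, prob_c):
    """Horvitz-Thompson estimate of the GATE: inverse-probability-weighted
    outcomes of treatment-exposed nodes minus those of control-exposed
    nodes, divided by the total number of nodes."""
    total = 0.0
    for v in y:
        if treated_exposed[v]:
            total += y[v] / prob_t[v]
        elif control_exposed[v]:
            total -= y[v] / prob_c[v]
    return total / len(y)


if __name__ == "__main__":
    # Toy path graph 0-1-2-3-4 as an adjacency dict (illustrative only).
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    probs = estimate_exposure_probs(path, q=0.5, n_samples=2000, seed=0)
    for node, prob in probs.items():
        print(node, round(prob, 3))
```

Under these assumptions, a node whose closed neighborhood touches k clusters has conditional exposure probability q^k for a fixed clustering, so a node that is "unlucky" in one clustering can have a very small probability and a very large inverse weight. Averaging over many random clusterings smooths these probabilities, which is the intuition the abstract gives for the variance reduction.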

Johan Ugander | Hao Yin
