Missing data in networks: exponential random graph (p∗) models for networks with non-respondents

Survey studies of complete social networks often involve non-respondents, whereby certain people within the “boundary” of a network do not complete a sociometric questionnaire, either by their own choice or by the design of the study, yet are still nominated by other respondents as network partners. We develop exponential random graph (p*) models for network data with non-respondents. We model respondents and non-respondents as two different types of nodes, distinguishing ties between respondents from ties that link respondents to non-respondents. Moreover, if we assume that the non-respondents are missing at random, we invoke homogeneity across certain network configurations to infer effects as applicable to the entire set of network actors. Using an example from a well-known network dataset, we show that treating a sizeable proportion of nodes as non-respondents may still result in estimates, and inferences about structural effects, consistent with those for the entire network. If, on the other hand, the principal research focus is on the respondent-only structure, with non-respondents clearly not missing at random, we incorporate the information about ties to non-respondents as exogenous. We illustrate this model with an example of a network within and between organizational departments. Because in this second class of models the number of non-respondents may be large, values of parameter estimates may not be directly comparable to those for models that exclude non-respondents. In the context of discussing recent technical developments in exponential random graph models, we present a heuristic method based on pseudo-likelihood estimation to infer whether certain structural effects may contribute substantially to the predictive capacity of a model, thereby enabling comparisons of important effects between models with differently sized node sets. © 2004 Elsevier B.V. All rights reserved.
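As a rough illustration of the two-node-type idea and of pseudo-likelihood estimation, the sketch below fits a small model by maximizing the log pseudo-likelihood, which amounts to a logistic regression of each observed tie on its change statistics. It is not the specification used in the paper. Everything in it is an assumption made for illustration: the network is undirected, the three effects (edges among respondents, edges from respondents to non-respondents, triangles) are chosen for simplicity, dyads between two non-respondents are treated as unobserved and left out, and the toy data and all function names (`build_adj`, `observed_dyads`, `change_stats`, `pseudo_loglik`, `fit_mple`) are hypothetical.

```python
import math
from itertools import combinations


def build_adj(n, ties):
    """Symmetric 0/1 adjacency matrix for an undirected graph."""
    adj = [[0] * n for _ in range(n)]
    for i, j in ties:
        adj[i][j] = adj[j][i] = 1
    return adj


def observed_dyads(n, respondents):
    """Dyads with at least one respondent; pairs of non-respondents
    are assumed unobserved and are excluded."""
    return [(i, j) for i, j in combinations(range(n), 2)
            if i in respondents or j in respondents]


def change_stats(adj, i, j, respondents):
    """Change in each statistic when tie (i, j) goes from absent to present."""
    both = i in respondents and j in respondents
    # Shared partners of i and j: triangles the tie would close.
    # Unobserved ties are simply counted as absent here (an assumption).
    shared = sum(1 for k in range(len(adj))
                 if k != i and k != j and adj[i][k] and adj[k][j])
    return [
        1.0 if both else 0.0,   # edge between two respondents
        0.0 if both else 1.0,   # edge linking a respondent to a non-respondent
        float(shared),          # triangle effect
    ]


def pseudo_loglik(theta, adj, respondents):
    """Log pseudo-likelihood: sum of logistic log-probabilities over dyads."""
    total = 0.0
    for i, j in observed_dyads(len(adj), respondents):
        d = change_stats(adj, i, j, respondents)
        eta = sum(t * x for t, x in zip(theta, d))
        # Numerically stable log(1 + exp(eta)).
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += adj[i][j] * eta - softplus
    return total


def fit_mple(adj, respondents, steps=2000, lr=0.05):
    """Maximize the log pseudo-likelihood by plain gradient ascent."""
    rows = [(change_stats(adj, i, j, respondents), adj[i][j])
            for i, j in observed_dyads(len(adj), respondents)]
    theta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for d, y in rows:
            eta = sum(t * x for t, x in zip(theta, d))
            p = 1.0 / (1.0 + math.exp(-eta))
            for k in range(3):
                grad[k] += (y - p) * d[k]
        theta = [t + lr * g / len(rows) for t, g in zip(theta, grad)]
    return theta


# Toy data invented for illustration: nodes 0-3 respond, nodes 4-5 do not.
respondents = {0, 1, 2, 3}
adj = build_adj(6, [(0, 1), (1, 2), (0, 2), (2, 3), (0, 4), (3, 5)])
theta_hat = fit_mple(adj, respondents)
print(theta_hat, pseudo_loglik(theta_hat, adj, respondents))
```

Keeping a separate edge parameter for respondent-to-non-respondent ties lets the density of those ties differ from the density among respondents. One simple way to gauge how much a structural effect contributes, in the spirit of the heuristic mentioned above though not necessarily the authors' procedure, is to compare the log pseudo-likelihood of fits with and without that effect.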
