An Empirically-Based Model for Network Estimation Under Uncertainty and Policy Analysis

Social network analysis has been used to understand groups of individuals and how they operate. Most of the literature on social networks has dealt with overt organizations that have an easily discernible network structure. This paper examines the possibility of using the inherent structures observed in social networks to predict networks from limited and missing information. The model is based on empirical network data exhibiting the structural properties of triad closure and adjacency. Triad closure indicates that if person i has a dyad with person j and person j has a dyad with person k, then there is a higher than chance likelihood that person i and person k have a dyad. The model exploits these properties, using an inference model to update adjacent dyads given information on a reference dyad. The model is tested against several networks to understand its behavior. The paper illustrates that, if the model is built with careful consideration of the network being predicted, it can assist in making better decisions about uncertain organizational phenomena. The method is applied in a covert network example and has been extended to show its usefulness in epidemiological networks and in improving the performance of organizations operating under stress. The paper opens new avenues in the development of models designed to make network predictions and to use those predictions to make better decisions.

Support: This research has been supported in part by the National Science Foundation IGERT in CASOS, the Office of Naval Research ONR 1681-1-1001944, and the Center for Computational Analysis of Social and Organizational Systems. The views and results expressed herein are solely the responsibility of the authors and do not represent the official views of the Office of Naval Research or the National Science Foundation.

Unknown Network Structures

Social network analysis has been used to understand organizational dynamics in a variety of application areas (e.g., epidemiology, technological diffusion, and management consulting). A group’s behavior, values, and performance can be articulated by understanding the relationships that exist within the group. Most applications to date have been to open groups or to societies in controlled experiments; very few have addressed covert or “hidden” networks. Social network measures and tools that could efficiently infer “hidden” networks from limited data could allow policy makers to make better informed decisions in a variety of applications.

This paper presents an empirically based probabilistic model, grounded in observational social network data, to infer network structure from limited and incomplete information. First, relative similarity information is used to build a prior probability assessment of who communicates with whom. As direct information on dyadic likelihood is received, these priors are updated. Adjacent dyads are then updated through an empirically based inference model. This continues until the likelihood of every dyad in the probabilistic network has been inferred and updated. The resulting network provides an estimate of the actual network and may be used to guide policy analysis.

Network Properties

Researchers have uncovered inherent structural properties in social networks (Skvoretz, 1990). These properties arise from the structure of the network itself and not from the behavior of the individuals in the network. They include reciprocity, triad-closure, and triad-closure reciprocity. A corollary of the triad properties is an adjacency property. Simply stated, if persons i and j are talkative with each other, then they are likely to be talkative with others.
Formally, if A and B are adjacent dyads, then $E(n_A) > \bar{n}$ if $n_B > \bar{n}$, and $E(n_A) < \bar{n}$ if $n_B < \bar{n}$, where $n_B$ is the number of interactions recorded on dyad B, $E(n_A)$ is the expected number of interactions on dyad A, and $\bar{n}$ is the mean number of interactions for the whole network. In other words, if B has above-average activity, then the expected number of interactions on each of its adjacent dyads will also exceed the network mean. The degree to which these properties exist varies from network to network (Krackhardt, 1987).

Constructing the Model

The problem domain will determine the relationship of interest (ROI). In most real-world situations, only samples of interactions between individuals can be observed. Depending on the type of interaction, knowing that i and j interacted will inform our belief about the likelihood of a ROI existing between them. But what inference, if any, can be made about these individuals’ relationships with others in the network?

For illustrative purposes and to facilitate model development, we focus on one social network data set: Bernard and Killworth’s 1979 observed interactions between 58 fraternity brothers at a West Virginia university. Because of the size of this data set, it was not possible to develop a robust inference model based on the triad-closure property. Instead, the model is based on adjacency properties found in the data. Figure 1 shows the relationship between the number of interactions on a reference dyad and the expected number of interactions on an adjacent dyad. As the number of communications on the reference dyad increases, so does the expected number on the adjacent dyads.

Figure 1. Adjacency Property: A Plot of Conversation Counts
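
The kind of relationship summarized in Figure 1 can be tabulated directly from a matrix of interaction counts. The Python sketch below is illustrative only and is not the authors' implementation. It assumes that two dyads are adjacent when they share exactly one person, which the text does not state explicitly. The function names `adjacency_profile` and `network_mean` are hypothetical, and the five-person matrix `toy` is invented for demonstration; it is not the Bernard and Killworth data.

```python
from collections import defaultdict
from itertools import combinations


def network_mean(counts):
    """Mean number of interactions per dyad (the n-bar of the text)."""
    dyads = list(combinations(range(len(counts)), 2))
    return sum(counts[i][j] for i, j in dyads) / len(dyads)


def adjacency_profile(counts):
    """Map each reference-dyad count to the mean count on adjacent dyads.

    `counts` is a symmetric n x n list of lists; counts[i][j] is the number
    of interactions observed between persons i and j (diagonal ignored).
    ASSUMPTION: dyads are adjacent when they share exactly one person.
    """
    n = len(counts)
    totals = defaultdict(float)  # reference count -> sum of adjacent counts
    sizes = defaultdict(int)     # reference count -> number of adjacent dyads
    for i, j in combinations(range(n), 2):
        ref = counts[i][j]
        for k in range(n):
            if k == i or k == j:
                continue
            # Dyads (i, k) and (j, k) each share one member with (i, j).
            totals[ref] += counts[i][k] + counts[j][k]
            sizes[ref] += 2
    return {ref: totals[ref] / sizes[ref] for ref in sorted(totals)}


# Invented toy network: persons 0-2 interact often with one another,
# persons 3-4 interact rarely with everyone.
toy = [
    [0, 8, 8, 2, 2],
    [8, 0, 8, 2, 2],
    [8, 8, 0, 2, 2],
    [2, 2, 2, 0, 1],
    [2, 2, 2, 1, 0],
]

profile = adjacency_profile(toy)
for ref_count, adjacent_mean in profile.items():
    print(ref_count, round(adjacent_mean, 3), "network mean:", network_mean(toy))
```

Each entry of `profile` is an empirical estimate of $E(n_A)$ given $n_B$ equal to the key, so plotting the values against the keys gives a curve of the same kind as Figure 1. Comparing each value with `network_mean(counts)` shows how strongly a given data set exhibits the adjacency property, which, as noted above, varies from network to network.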