Social patterns: Community detection using behavior-generated network datasets

A set of behavior rules, personal characteristics, group affiliations, and roles was used to generate a dataset of mixed communication actions modeling those at a large organization. Several approaches to community detection and modeling were applied to this generated dataset to compare the strengths and range of applicability of different algorithms. Graph partitioning methods performed well at assigning membership to formal, exclusive groups such as organizational departments, provided the target number of groups was known a priori. SSDE-cluster, a fast and scalable algorithm, performed well in detecting normal departments and can be used when the number of groups is not known. It was also able to detect small overlapping groups, but with only moderate accuracy. Clique enumeration performed well in detecting small overlapping groups when a priori knowledge of average group size was used. Different methods of constructing social network graphs from the mixed communication actions were investigated, as were different link weighting methods. We conclude that behavior-generated datasets with complex and complete ground truths are useful for collaborative validation of different community and role detection and modeling methods.
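
To make the graph-construction and clique-enumeration steps concrete, the following is a minimal sketch, not the procedure used in this work. It assumes each communication action is a (sender, receiver, channel) triple, uses hypothetical per-channel link weights, and filters maximal cliques (found with the standard Bron-Kerbosch algorithm) by an assumed average group size. All names, weights, and sample actions are illustrative.

```python
from collections import defaultdict

# Hypothetical per-channel link weights; the source does not specify any.
CHANNEL_WEIGHTS = {"email": 1.0, "phone": 2.0, "meeting": 3.0}


def build_weighted_graph(actions, min_weight=0.0):
    """Aggregate (sender, receiver, channel) actions into an undirected graph.

    Returns (weights, adjacency): weights maps frozenset({a, b}) to the summed
    link weight; adjacency keeps only links with weight >= min_weight.
    """
    weights = defaultdict(float)
    for sender, receiver, channel in actions:
        if sender == receiver:
            continue  # ignore self-directed actions
        weights[frozenset((sender, receiver))] += CHANNEL_WEIGHTS.get(channel, 1.0)

    adjacency = defaultdict(set)
    for edge, weight in weights.items():
        if weight >= min_weight:
            a, b = tuple(edge)
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(weights), dict(adjacency)


def maximal_cliques(adjacency):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""

    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbours still in p.
        pivot = max(p | x, key=lambda v: len(adjacency.get(v, set()) & p))
        for v in list(p - adjacency.get(pivot, set())):
            neighbours = adjacency.get(v, set())
            yield from expand(r | {v}, p & neighbours, x & neighbours)
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adjacency), set())


def candidate_groups(adjacency, expected_size, tolerance=1):
    """Keep maximal cliques whose size is within tolerance of expected_size."""
    return sorted(
        sorted(clique)
        for clique in maximal_cliques(adjacency)
        if abs(len(clique) - expected_size) <= tolerance
    )


# Illustrative actions only; not data from the generated dataset.
actions = [
    ("ann", "bob", "email"), ("bob", "ann", "phone"), ("ann", "cat", "meeting"),
    ("bob", "cat", "email"), ("cat", "dan", "email"), ("dan", "eve", "email"),
    ("cat", "eve", "phone"), ("eve", "fay", "email"),
]
weights, adjacency = build_weighted_graph(actions)
groups = candidate_groups(adjacency, expected_size=3, tolerance=0)
print(groups)  # [['ann', 'bob', 'cat'], ['cat', 'dan', 'eve']]
```

In this toy example "cat" belongs to both recovered groups, which shows how clique enumeration can surface overlapping membership that an exclusive partition would not. Raising `min_weight` prunes weak links before enumeration, one simple way the choice of link weighting can change the detected groups.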