Efficient and Effective Pair-Matching Algorithms for Agent-Based Models

Microsimulations and agent-based models across various disciplines need to match agents into relationships. Some of these models need to repeatedly match different pairs of agents, for example microsimulations of sexually transmitted infection epidemics. We describe the requirements for pair-matching in these types of microsimulations, and present several pair-matching algorithms: Brute force (BFPM), Random (RPM), Random k (RKPM), Weighted shuffle (WSPM), Cluster shuffle (CSPM), and Distribution counting (DCPM). Using two microsimulations, we empirically compare the speeds, and pairing quality of these six algorithms. For models which execute pair-matching many thousands or millions of times, BFPM is not usually a practical option because it is slow. On the other hand, RPM is fast but chooses poor quality pairs. Nevertheless both algorithms are used, sometimes implicitly, in many models. Here we use them as yardsticks for upper and lower bounds for speed and quality. In these tests CSPM offers the best trade-off of speed and effectiveness. In general, CSPM is fast and produces stochastic, high quality pair-matches, which are often desirable characteristics for pair-matching in discrete time step microsimulations. Moreover it is a simple algorithm that can be easily adapted for the specific needs of a particular domain. However, for some models, RKPM or DCPM would be as fast as CSPM with matches of similar quality. We discuss the circumstances under which this would happen.