Canopy — Fast Sampling with Cover Trees

• Decompose $\{1, \ldots, n\}$ into sets $L$ and $H$, with $i \in L$ if $\pi_i < n^{-1}$ and $i \in H$ otherwise.
• For each $i \in L$ pick some $j \in H$.
  – Append the triple $(i, j, \pi_i)$ to an array $A$.
  – Set the residual $\pi'_j := \pi_j + \pi_i - n^{-1}$.
  – If $\pi'_j > n^{-1}$ return $\pi'_j$ to $H$, otherwise to $L$.

Preprocessing takes $O(n)$ computation and memory since we remove one element at a time from $L$.

• To sample from the array pick $u \sim U(0, 1)$ uniformly at random.
• Choose the tuple $(i, j, \pi_i)$ at position $\lfloor un \rfloor$.
• If $u - n^{-1} \lfloor un \rfloor < \pi_i$ return $i$, else return $j$.

This step costs $O(1)$ operations and it follows by construction that $i$ is returned with probability $\pi_i$.

Now we need a data structure that will allow us to sample many objects in bulk without the need to inspect each item individually. Cover trees satisfy this requirement.
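
The alias-table preprocessing and the O(1) sampling step described above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation. The function names `build_alias_table` and `sample_alias` are hypothetical. The sketch assumes the input probabilities sum to one, stores the current residual mass of item i in each triple, and adds one step the text does not spell out: any item left over once L or H is empty holds mass 1/n (up to rounding) and is paired with itself.

```python
import random


def build_alias_table(pi):
    """Build the array A of triples (i, j, p_i) from probabilities pi.

    Assumes pi is a non-empty sequence of non-negative floats summing to 1.
    """
    n = len(pi)
    threshold = 1.0 / n
    residual = list(pi)  # working copy holding the residual masses
    low = [i for i, p in enumerate(residual) if p < threshold]    # set L
    high = [i for i, p in enumerate(residual) if p >= threshold]  # set H
    table = []
    while low and high:
        i = low.pop()
        j = high.pop()
        # Slot holds mass residual[i] for item i; item j fills the rest.
        table.append((i, j, residual[i]))
        residual[j] = residual[j] + residual[i] - threshold
        # Return j to H if it still exceeds 1/n, otherwise to L.
        (high if residual[j] > threshold else low).append(j)
    # Assumption: leftover items hold mass 1/n up to rounding error,
    # so each one fills a slot on its own.
    for i in low + high:
        table.append((i, i, threshold))
    return table


def sample_alias(table, rng=random):
    """Draw one index in O(1) time from a table built above."""
    n = len(table)
    u = rng.random()               # u ~ U(0, 1)
    k = min(int(u * n), n - 1)     # position floor(u * n), guarded
    i, j, p_i = table[k]
    # u - k / n is the offset of u inside slot k, which has width 1/n.
    return i if u - k / n < p_i else j
```

Each slot of the table has width 1/n and is shared by at most two items, which is why a single uniform draw suffices to pick both the slot and the item within it.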