Fast Probabilistic Subsumption Checking for Publish/Subscribe Systems

Efficient subsumption checking, that is, deciding whether a subscription or publication is subsumed (covered) by a set of previously defined subscriptions, is of paramount importance for publish/subscribe systems. It provides core system functionality and also reduces the overall system load and the traffic generated in distributed environments. Because solving the problem deterministically was previously shown to be co-NP-complete, and existing solutions typically rely on costly pairwise comparisons to detect the subsumption relationship, we propose a probabilistic algorithm for the general subsumption problem. It efficiently determines whether a publication or subscription is covered by a disjunction of subscriptions in $O(kmd)$ time, where $k$ is the number of subscriptions, $m$ is the number of distinct attributes in the subscriptions, and $d$ is the number of tests performed to answer a subsumption question. The probability of error is problem-specific and typically very small; it determines an upper bound on $d$ that is computed in polynomial time before the algorithm is executed. Our experimental results demonstrate that the algorithm performs even better in practice owing to the optimizations we introduce, and that it is adequate for fast forwarding of publications and subscriptions, especially in resource-scarce environments such as sensor networks.
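
To make the $O(kmd)$ cost concrete, the following Python sketch shows one way such a probabilistic test could look. It is an illustration under stated assumptions, not the algorithm evaluated in this paper: subscriptions are modeled as conjunctions of closed numeric ranges over the same $m$ attributes, each of the $d$ tests draws a uniformly random point inside the candidate subscription, and the names `covers` and `probably_subsumed` are hypothetical.

```python
import random

# Assumed representation (not taken from the paper): a subscription is a dict
# mapping each attribute name to a closed interval (low, high). All
# subscriptions are assumed to constrain the same set of m attributes.


def covers(sub, point):
    """Return True if `point` satisfies every range constraint of `sub`: O(m)."""
    return all(low <= point[attr] <= high for attr, (low, high) in sub.items())


def probably_subsumed(s, subs, d, rng=None):
    """Monte Carlo test of whether `s` is covered by the disjunction of `subs`.

    Performs at most d tests; each test checks one random point of `s`
    against up to k subscriptions on m attributes, so the cost is O(k*m*d).
    """
    rng = rng or random.Random()
    for _ in range(d):
        # Draw one point uniformly at random from the region matched by s.
        point = {attr: rng.uniform(low, high) for attr, (low, high) in s.items()}
        if not any(covers(t, point) for t in subs):
            return False  # uncovered witness found: s is definitely not subsumed
    return True  # no witness in d tests: s is subsumed with high probability


if __name__ == "__main__":
    s = {"x": (0.0, 10.0), "y": (0.0, 10.0)}
    left = {"x": (0.0, 5.0), "y": (0.0, 10.0)}
    right = {"x": (5.0, 10.0), "y": (0.0, 10.0)}
    print(probably_subsumed(s, [left, right], d=100))
    print(probably_subsumed(s, [left], d=100))
```

In this sketch a `False` answer is always correct, because it is backed by a concrete uncovered point. A `True` answer can be wrong only when all $d$ sampled points happen to miss the uncovered part of the candidate subscription, which is why the number of tests $d$ controls the probability of error.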