High performance content-based matching using GPUs

Matching incoming event notifications against received subscriptions is a fundamental part of every publish-subscribe infrastructure. In content-based systems this is a complex and time-consuming task whose performance affects that of the entire system. Several algorithms have been proposed for efficient content-based event matching. While they differ in most respects, they share one trait: all were conceived to run on conventional, sequential hardware. Modern Graphics Processing Units (GPUs), by contrast, offer off-the-shelf, highly parallel hardware at a reasonable cost. Unfortunately, GPUs introduce a very different model of computation, which requires algorithms to be fully redesigned. In this paper, we describe a new content-based matching algorithm designed to run efficiently on CUDA, a widespread architecture for general-purpose programming on GPUs. A detailed comparison with SFF, the matching algorithm of Siena, known for its efficiency, demonstrates that GPUs can bring impressive speedups in content-based matching. The same analysis also highlights the aspects of CUDA programming that most affect performance.
