Fast and Scalable Classification of Structured Data in the Network

For many network services, such as firewalling, load balancing, or cryptographic acceleration, data packets need to be classified (or filtered) before network appliances can apply any actions to them. Typical actions are manipulating headers, discarding packets, or tagging packets with additional information required for later processing. Structured data, such as XML, is independent of any particular presentation format and is therefore well suited as an information exchange format among heterogeneous sources. In this paper, we propose a new algorithm for fast and efficient classification of structured data in the network. In our approach, packet processing and classification are performed on the structured payload data rather than on packet header information alone. Using a combination of hash functions, Bloom filters, and set intersection theory, our algorithm builds a hierarchical, layered data element tree over the input grammar that requires logarithmic time and tractable space.
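To make the building blocks named above concrete, the following is a minimal sketch of payload-based classification with a Bloom filter. It is not the algorithm proposed in this paper: the names BloomFilter, element_paths, and classify, the filter size, the number of hash functions, and the use of SHA-256 with double hashing are all assumptions chosen for illustration. The sketch matches the element paths of an XML payload against one filter per rule and does not build the hierarchical element tree.

```python
import hashlib
import xml.etree.ElementTree as ET


class BloomFilter:
    """Fixed-size Bloom filter (illustrative; sizes are assumed, not from the paper)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, item):
        # Double hashing: derive all bit positions from two 64-bit values.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def element_paths(xml_text):
    """Return the set of root-to-element tag paths in an XML payload."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def classify(xml_text, rule_filters):
    """Return the names of rules whose filter may contain a payload path."""
    paths = element_paths(xml_text)
    return sorted(
        name for name, bf in rule_filters.items()
        if any(path in bf for path in paths)
    )


# Hypothetical rule: classify payloads that contain an /order/item element.
orders = BloomFilter()
orders.add("/order/item")
payload = "<order><item>book</item><customer>alice</customer></order>"
print(classify(payload, {"orders": orders}))  # prints ['orders']
```

Because a Bloom filter admits false positives, a match in this sketch only means that a rule may apply. A real classifier would confirm the match against the exact rule before acting on the packet.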