Index-Trie: Efficient archival and retrieval of network traffic

Abstract: Historical network traffic traces, at both the flow and packet level, play a significant role in many research and engineering areas, such as network security, traffic engineering, and accounting. To retrieve specific entries quickly from large traces, each packet or flow should be indexed on multiple query fields during archiving. This poses challenges for both archiving speed and storage consumption. We propose a network traffic indexing and querying method based on the Index-Trie that achieves fast archiving, low index storage overhead, and fast retrieval. We implemented a system for online trace archival and retrieval. Our experiments, performed both offline and online on backbone, campus, and datacenter network traffic, demonstrate that our method outperforms the popular FastBit method. For packet traces, the Index-Trie based method improves the archiving rate by up to 72%, lowers storage consumption by 56%, and makes retrieval 14 times faster. For flow traces, compared to FastBit, our system achieves up to a 15 times higher archiving rate, 42% lower storage consumption, and 100 times faster retrieval. Furthermore, we extend the application of Index-Tries to log file indexing and retrieval.
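To make the idea of indexing each record on multiple query fields concrete, the following is a minimal sketch in Python. It is not the Index-Trie structure of the paper, whose layout is not described in this abstract. It assumes one plain byte-wise trie per field, with each leaf holding the positions of matching records in the archive. The names `FieldTrie`, `TraceIndex`, `ip`, and `port` are hypothetical and exist only for illustration.

```python
import ipaddress


class FieldTrie:
    """Byte-wise trie mapping a fixed-width field value to record positions.

    Illustrative only: a plain trie, not the paper's Index-Trie.
    """

    def __init__(self):
        self.root = {}

    def insert(self, key: bytes, position: int) -> None:
        node = self.root
        for b in key:  # descend one level per key byte
            node = node.setdefault(b, {})
        node.setdefault("positions", []).append(position)

    def lookup(self, key: bytes) -> list:
        node = self.root
        for b in key:
            node = node.get(b)
            if node is None:  # no record carries this value
                return []
        return node.get("positions", [])


class TraceIndex:
    """One trie per query field; positions are record numbers in the archive."""

    def __init__(self, fields):
        self.tries = {f: FieldTrie() for f in fields}
        self.count = 0

    def archive(self, record: dict) -> int:
        position = self.count
        for field, trie in self.tries.items():
            trie.insert(record[field], position)
        self.count += 1
        return position

    def query(self, **conditions) -> list:
        # Intersect the position lists of all requested fields.
        result = None
        for field, key in conditions.items():
            hits = set(self.tries[field].lookup(key))
            result = hits if result is None else result & hits
        return sorted(result or [])


def ip(text: str) -> bytes:
    """Hypothetical helper: IPv4 address as a 4-byte key."""
    return ipaddress.ip_address(text).packed


def port(number: int) -> bytes:
    """Hypothetical helper: port number as a 2-byte key."""
    return number.to_bytes(2, "big")


index = TraceIndex(["src_ip", "dst_port"])
index.archive({"src_ip": ip("10.0.0.1"), "dst_port": port(80)})
index.archive({"src_ip": ip("10.0.0.2"), "dst_port": port(80)})
index.archive({"src_ip": ip("10.0.0.1"), "dst_port": port(443)})

print(index.query(src_ip=ip("10.0.0.1")))                     # [0, 2]
print(index.query(src_ip=ip("10.0.0.1"), dst_port=port(80)))  # [0]
```

In this sketch, archiving costs one trie walk per indexed field, and a multi-field query is an intersection of per-field position lists. A real system would also need compact node storage and on-disk layout, which are outside the scope of this illustration.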
