VATE: a trade-off between memory and preserving time for high accuracy cardinalities estimation under sliding time window

Abstract. Host cardinality is one of the most important attributes in network research, and cardinality estimation under a sliding time window has become a research hotspot in recent years. Algorithms of this kind preserve the time information of the sliding window by introducing more powerful counters, referred to here as sliding counters. The more counters such an algorithm uses, the higher its estimation accuracy. However, the number of sliding counters that can be deployed is limited by their large memory footprint or long state-preserving time. To solve this problem, this paper designs a new sliding counter, the asynchronous time stamp (AT). AT has a small memory footprint and a short state-preserving time, and it can directly replace the counters used in existing algorithms. On the same platform, higher accuracy can be achieved by deploying more ATs. Based on AT, this paper further designs a new per-host cardinality estimation algorithm, the virtual AT estimator (VATE). VATE is a parallel algorithm that can be deployed on a GPU. With the GPU's parallel processing capability, VATE can estimate host cardinalities in a 40 Gb/s high-speed network in real time at a time granularity of 1 s. In our experiments, VATE preserves state 4 to 400 times faster than a state-of-the-art algorithm, at the cost of 11.11% more memory.
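
To make the notion of a counter that "preserves the time information" of a sliding window concrete, the following is a minimal Python sketch of a generic timestamp-based linear counter. It is not the AT counter or the VATE algorithm, whose internals the abstract does not describe. The class name `TimestampLinearCounter`, its parameters, and the choice of SHA-256 as the hash function are all assumptions made for illustration. Each slot stores the last time slice in which it was hit, and a slot counts as occupied only if that time slice still lies inside the window.

```python
import hashlib
import math


class TimestampLinearCounter:
    """Illustrative sliding-window linear counter (hypothetical, not AT).

    Each slot remembers the most recent time slice in which some item
    hashed to it. Slots whose timestamp has left the window are treated
    as empty, so no explicit expiry pass is needed at estimation time.
    """

    def __init__(self, num_slots, window):
        self.num_slots = num_slots      # number of slots (memory budget)
        self.window = window            # window length, in time slices
        self.last_seen = [None] * num_slots

    def _slot(self, item):
        # Map an item (e.g. a peer address string) to a slot index.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_slots

    def add(self, item, now):
        # Record that this slot was hit during time slice `now`.
        self.last_seen[self._slot(item)] = now

    def estimate(self, now):
        # A slot is empty if it was never hit or was last hit before
        # the oldest time slice still covered by the window.
        oldest = now - self.window + 1
        empty = sum(1 for t in self.last_seen if t is None or t < oldest)
        if empty == 0:
            return float("inf")         # structure is saturated
        # Linear counting estimate: m * ln(m / empty_slots).
        return self.num_slots * math.log(self.num_slots / empty)


counter = TimestampLinearCounter(num_slots=4096, window=3)
for i in range(100):
    counter.add(f"10.0.0.{i}", now=0)   # 100 distinct peers in slice 0

print(counter.estimate(now=2))          # slice 0 is still inside the window
print(counter.estimate(now=3))          # slice 0 has expired
```

In this sketch every slot holds a full timestamp, which illustrates the memory side of the trade-off the abstract describes: storing richer per-slot time state costs more memory than a single bit, while cheaper representations typically need extra work to keep their state current as the window slides.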
