Frequent Itemsets Mining in Network Traffic Data

Many projects have tried to analyze the structure and dynamics of application overlay networks on the Internet using packet analysis and network flow data. While such analysis is essential for a variety of network management and security tasks, it is difficult on many networks: either the volume of data is so large that packet inspection becomes intractable, or privacy concerns forbid packet capture and require that network flows be dissociated from users' actual IP addresses. This paper proposes a privacy-preserving algorithm for mining frequent itemsets. First, only maximal frequent itemsets are considered, which greatly reduces the number of itemsets that must be handled. Second, the intermediate mining results are encrypted to address security concerns. Experimental results show that the proposed algorithm is both accurate and efficient.
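To illustrate why restricting attention to maximal itemsets shrinks the output, the following minimal Python sketch enumerates frequent itemsets by brute force and then keeps only those with no frequent proper superset. It is not the algorithm proposed in the paper, and it omits the encryption of intermediate results. The function names, the flow attributes (`proto=TCP`, `dport=80`, `src=A`), and the support threshold are all hypothetical choices made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support.

    Brute-force enumeration for illustration only; practical miners
    (Apriori, FP-growth, and similar) prune the search far more aggressively.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                frequent[candidate] = count
                found = True
        if not found:
            # No frequent itemset of this size exists, so no larger one can.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(itemset < other for other in frequent)
    }


# Hypothetical flow records; source addresses are replaced by labels (A, B).
flows = [
    {"proto=TCP", "dport=80", "src=A"},
    {"proto=TCP", "dport=80", "src=B"},
    {"proto=TCP", "dport=80", "src=A"},
    {"proto=UDP", "dport=53", "src=A"},
]

frequent = frequent_itemsets(flows, min_support=2)
maximal = maximal_itemsets(frequent)

print(len(frequent), "frequent itemsets,", len(maximal), "maximal")
for itemset, count in maximal.items():
    print(sorted(itemset), count)
```

With a support threshold of 2, the four toy flows yield seven frequent itemsets but only one maximal itemset. Every other frequent itemset is a subset of it, so reporting the maximal set alone still identifies all frequent itemsets, although their individual support counts are not retained.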