Revealing skype traffic: when randomness plays with you

Skype is a very popular VoIP software which has recently attracted the attention of the research community and network operators. Following a closed source and proprietary design, Skype protocols and algorithms are unknown. Moreover, strong encryption mechanisms are adopted by Skype, making it very difficult to even glimpse its presence from a traffic aggregate. In this paper, we propose a framework based on two complementary techniques to reveal Skypetraffic in real time. The first approach, based on Pearson'sChi-Square test and agnostic to VoIP-related trafficcharacteristics, is used to detect Skype's fingerprint from the packet framing structure, exploiting the randomness introduced at the bit level by the encryption process. Conversely, the second approach is based on a stochastic characterization of Skype traffic in terms of packet arrival rate and packet length, which are used as features of a decision process based on Naive Bayesian Classifiers.In order to assess the effectiveness of the above techniques, we develop an off-line cross-checking heuristic based on deep-packet inspection and flow correlation, which is interesting per se. This heuristic allows us to quantify the amount of false negatives and false positives gathered by means of the two proposed approaches: results obtained from measurements in different networks show that the technique is very effective in identifying Skype traffic. While both Bayesian classifier and packet inspection techniques are commonly used, the idea of leveraging on randomness to reveal traffic is novel. We adopt this to identify Skype traffic, but the same methodology can be applied to other classification problems as well.