Early identification of spammers through identity linking, social network and call features

Abstract: Malicious users create multiple identities to gain financial benefit through activities such as spamming, committing fraud, and abusing the system. A single malicious individual may hold a large number of identities in order to target a large number of legitimate individuals. Linking the identities of an individual helps protect legitimate users from abuse and fraud, and maintains the reputation of the service provider. Simply analyzing each identity's historical behavior is not sufficient to block spammers who frequently change identity, because a spammer quickly discards an identity and starts using a new one. Moreover, a spammer may appear to be a legitimate user on initial analysis, for example because each identity has only a small number of interactions. The challenge is to identify the spammer by analyzing the aggregate behavior of an individual rather than that of a single calling identity. This paper presents EIS (Early Identification of Spammers), a system for the early identification of spammers who frequently change identities. EIS uses the social call graph among identities and consists of three modules: (1) an ID-CONNECT module that links identities belonging to one physical individual, based on the social network structure and calling attributes of the identities; (2) a reputation module that computes the reputation of an individual from the aggregate behavior of his different identities; and (3) a detection module that computes an automated threshold below which an individual is classified as a spammer, and otherwise as a non-spammer. We evaluate the proposed system on a synthetic data set generated for different graph networks and different percentages of spammers. Performance analysis shows that EIS is effective against spammers who frequently change their identities and achieves a high true positive rate when spammers have high small overlap in target victims from their identities.
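The three-stage flow described in the abstract (link identities, score the linked individual, apply a threshold) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the method from the paper: identities are linked when the Jaccard similarity of their callee sets reaches a hypothetical `min_sim`, reputation is taken as the fraction of outgoing calls that are reciprocated, and the "automated threshold" is simply the mean reputation score. All function names and the sample data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_identities(callees, min_sim=0.5):
    """Group identities whose callee sets overlap strongly (assumed linking rule)."""
    parent = {i: i for i in callees}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(callees, 2):
        if jaccard(callees[a], callees[b]) >= min_sim:
            parent[find(a)] = find(b)

    groups = {}
    for i in callees:
        groups.setdefault(find(i), set()).add(i)
    return list(groups.values())


def reputation(group, callees):
    """Fraction of a group's outgoing calls that the callee reciprocates (assumed score)."""
    out_calls = [(i, c) for i in group for c in callees[i]]
    if not out_calls:
        return 1.0
    returned = sum(1 for i, c in out_calls if i in callees.get(c, set()))
    return returned / len(out_calls)


def detect(groups, callees):
    """Flag groups whose reputation is below the mean score (assumed threshold)."""
    scores = [reputation(g, callees) for g in groups]
    threshold = mean(scores)
    return [(g, s, s < threshold) for g, s in zip(groups, scores)]


# Hypothetical call graph: identity -> set of identities it has called.
# "s1" and "s2" stand for two identities of one spammer; the rest call in pairs.
callees = {
    "s1": {"a", "b", "c", "d", "e"},
    "s2": {"a", "b", "c", "d", "f"},
    "a": {"b"}, "b": {"a"},
    "c": {"d"}, "d": {"c"},
    "e": {"f"}, "f": {"e"},
}

results = detect(link_identities(callees), callees)
flagged = [group for group, score, is_spammer in results if is_spammer]
print(flagged)
```

In this toy graph the two spammer identities share most of their callees, so they are merged and scored as one individual; neither receives any return calls, which places the merged individual below the mean reputation while the reciprocating users stay above it.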

[22]  Saurabh Bagchi,et al.  Spam detection in voice-over-IP calls through semi-supervised clustering , 2009, 2009 IEEE/IFIP International Conference on Dependable Systems & Networks.

[23]  Muhammad Ajmal Azad,et al.  ID-CONNECT: Combining Network and Call Features to Link Different Identities of a User , 2015, 2015 IEEE 18th International Conference on Computational Science and Engineering.

[24]  M. Newman Random Graphs as Models of Networks , 2002, cond-mat/0202208.

[25]  Eric Y. Chen,et al.  Using Call Patterns to Detect Unwanted Communication Callers , 2009, 2009 Ninth Annual International Symposium on Applications and the Internet.

[26]  Vincent Y. Shen,et al.  User identification across multiple social networks , 2009, 2009 First International Conference on Networked Digital Technologies.

[27]  Ling Huang,et al.  Joint Link Prediction and Attribute Inference Using a Social-Attribute Network , 2014, TIST.

[28]  Victor C. M. Leung,et al.  Modeling Channel Occupancy Times for Voice Traffic in Cellular Networks , 2007, 2007 IEEE International Conference on Communications.

[29]  Hassan Takabi,et al.  Towards active detection of identity clone attacks on online social networks , 2011, CODASPY '11.

[30]  Marit Hansen,et al.  Developing a Legally Compliant Reachability Management System as a Countermeasure against SPIT 1 , 2006 .

[31]  Virgílio A. F. Almeida,et al.  Studying User Footprints in Different Online Social Networks , 2012, 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining.

[32]  Cullen Jennings,et al.  The Session Initiation Protocol (SIP) and Spam , 2008, RFC.

[33]  Richard Chbeir,et al.  User Profile Matching in Social Networks , 2010, 2010 13th International Conference on Network-Based Information Systems.

[34]  Christos Faloutsos,et al.  Catching Synchronized Behaviors in Large Networks , 2016, ACM Trans. Knowl. Discov. Data.

[35]  Peter Fankhauser,et al.  Identifying Users Across Social Tagging Systems , 2011, ICWSM.

[36]  Pável Calado,et al.  Resolving user identities over social networks through supervised learning and rich similarity features , 2012, SAC '12.

[37]  Dongwook Shin,et al.  Progressive multi gray-leveling: a voice spam protection algorithm , 2006, IEEE Network.

[38]  Angelos D. Keromytis,et al.  A Comprehensive Survey of Voice over IP Security Research , 2012, IEEE Communications Surveys & Tutorials.

[39]  Antonio Nucci,et al.  You can SPIT, but you can't hide: Spammer identification in telephony networks , 2011, 2011 Proceedings IEEE INFOCOM.

[40]  Shyhtsun Felix Wu,et al.  Analysis of user keyword similarity in online social networks , 2011, Social Network Analysis and Mining.

[41]  Damien Ernst,et al.  Outbound SPIT filter with optimal performance guarantees , 2012, Comput. Networks.

[42]  Xinyuan Wang,et al.  Call Behavioral Analysis to Thwart SPIT Attacks on VoIP Networks , 2011, SecureComm.

[43]  Cao Xiao,et al.  Detecting Clusters of Fake Accounts in Online Social Networks , 2015, AISec@CCS.

[44]  Arvind Krishnamurthy,et al.  Studying Spamming Botnets Using Botlab , 2009, NSDI.

[45]  Bartunov Sergey,et al.  Joint Link-Attribute User Identity Resolution in Online Social Networks , 2012 .

[46]  Miika Komu,et al.  Cure for Spam Over Internet Telephony , 2007, 2007 4th IEEE Consumer Communications and Networking Conference.