Robust link prediction in criminal networks: A case study of the Sicilian Mafia

Abstract

Link prediction can be particularly challenging on noisy and incomplete networks, such as criminal networks. Moreover, the effectiveness of link prediction may vary across the different types of relations within a social group. We address these issues by assessing the performance of different link prediction algorithms on a mafia organization. The analysis relies on an original dataset manually extracted from the judicial documents of operation “Montagna”, conducted by Italian law enforcement agencies against individuals affiliated with the Sicilian Mafia. From this dataset we extracted two networks: one recording meetings and one recording telephone calls among suspects. We conducted two experiments on these networks. First, we applied several link prediction algorithms and observed that those leveraging the full graph topology (such as the Katz score) provide very accurate results even on very sparse networks. Second, we carried out extensive simulations to investigate how the noisy and incomplete nature of criminal networks may affect the accuracy of link prediction algorithms. The experimental findings suggest that link predictions remain relatively reliable provided that only a limited share of the connections is hidden or missing and that the unobserved edges follow some kind of generative law. The differing results on the meeting and telephone call networks indicate that the specific features of a network should be taken into careful consideration.
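To make the two experiments more concrete, the following is a minimal sketch, not the authors' implementation, of Katz-score link prediction evaluated by hiding a fraction of edges. The toy graph, the function names (`katz_scores`, `hide_edges`, `auc`), the choice of the damping parameter `beta`, and the uniform-random hiding scheme are all assumptions made for illustration; the abstract does not specify the actual simulation protocol.

```python
import numpy as np


def katz_scores(adj, beta=None):
    """Katz similarity: sum over k >= 1 of beta^k * A^k = (I - beta*A)^-1 - I."""
    n = adj.shape[0]
    if beta is None:
        # The series converges only if beta < 1 / lambda_max.
        # Using half of that bound is an illustrative assumption.
        lam_max = np.max(np.abs(np.linalg.eigvalsh(adj)))
        beta = 0.5 / lam_max if lam_max > 0 else 0.5
    identity = np.eye(n)
    return np.linalg.inv(identity - beta * adj) - identity


def hide_edges(adj, fraction, rng):
    """Remove a uniformly random fraction of edges (assumed hiding scheme)."""
    iu, ju = np.triu_indices_from(adj, k=1)
    present = adj[iu, ju] > 0
    edges = np.column_stack((iu[present], ju[present]))
    n_hide = max(1, int(round(fraction * len(edges))))
    picked = rng.choice(len(edges), size=n_hide, replace=False)
    hidden = edges[picked]
    observed = adj.copy()
    observed[hidden[:, 0], hidden[:, 1]] = 0
    observed[hidden[:, 1], hidden[:, 0]] = 0
    return observed, hidden


def auc(scores, adj, hidden):
    """Chance that a hidden edge outscores a true non-edge (ties count half)."""
    iu, ju = np.triu_indices_from(adj, k=1)
    absent = adj[iu, ju] == 0
    pos = scores[hidden[:, 0], hidden[:, 1]]
    neg = scores[iu[absent], ju[absent]]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


# Hypothetical toy network: two 6-node cliques joined by a single bridge.
n = 12
adj = np.zeros((n, n))
adj[:6, :6] = 1.0
adj[6:, 6:] = 1.0
np.fill_diagonal(adj, 0.0)
adj[5, 6] = adj[6, 5] = 1.0

rng = np.random.default_rng(0)
observed, hidden = hide_edges(adj, fraction=0.2, rng=rng)
scores = katz_scores(observed)
result = auc(scores, adj, hidden)
print(f"AUC with 20% of edges hidden: {result:.3f}")
```

Repeating the last four lines over a range of `fraction` values would give a rough picture of how accuracy degrades as more of the network goes unobserved, under the assumption that edges are hidden uniformly at random.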
