An Entity Resolution Model based on Time-Sensitive for Telecom Fraud Crime Investigation

Entity resolution is an important part of social network analysis; it aims to identify the real-world entity behind a set of ambiguous entities. Building on the standard terminology of graph entity recognition, this paper proposes an improved Jaccard similarity measure tailored to the features of telecom fraud crime, and presents a novel entity resolution algorithm that uses two new measures: the time step and the time-sensitive coefficient. The algorithm can be applied across multiple time steps to analyze how an entity changes over time. By taking full account of the suspects' movement area as established by the police bureau, it excludes impossible telecom entities and ultimately discovers the resulting cluster of the given mobile numbers that possibly belong to the same entity. This approach can help the police bureau narrow the scope of an investigation and improve the effectiveness of anti-crime efforts in the information age. Applied to telecom fraud cases using phone-call data collected by the police bureau, the algorithm demonstrates its feasibility, practicality, and effectiveness.
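The abstract does not give the formulas, so the following is only a minimal sketch of how a time-weighted Jaccard similarity between two mobile numbers might look. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the representation of each number as one set of contacted numbers per time step, the exponential `decay` factor standing in for the time-sensitive coefficient, and the area filter based on shared area labels.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Plain Jaccard similarity |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def time_sensitive_similarity(
    contacts_a: List[Set[str]],
    contacts_b: List[Set[str]],
    decay: float = 0.5,
) -> float:
    """Weighted average of per-time-step Jaccard similarities.

    contacts_x[t] is the set of numbers contacted in time step t (oldest first).
    Hypothetical weighting: the most recent step has weight 1 and each earlier
    step is multiplied by `decay`, so recent behavior counts more.
    """
    if len(contacts_a) != len(contacts_b):
        raise ValueError("both numbers need the same count of time steps")
    n = len(contacts_a)
    weights = [decay ** (n - 1 - t) for t in range(n)]
    total = sum(weights)
    if total == 0:
        return 0.0
    weighted = sum(
        w * jaccard(a, b) for w, a, b in zip(weights, contacts_a, contacts_b)
    )
    return weighted / total


def filter_by_area(
    candidate_areas: Dict[str, Set[str]], suspect_area: Set[str]
) -> List[str]:
    """Keep only candidate numbers seen in at least one area of the suspect's movement area."""
    return sorted(
        number
        for number, areas in candidate_areas.items()
        if areas & suspect_area
    )
```

With `decay=1.0` the score reduces to the unweighted mean of the per-step Jaccard values. A real system would presumably cluster numbers whose score exceeds some threshold after the area filter has removed impossible candidates, but the threshold and clustering rule are not stated in the abstract.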
