A review of data mining applications in crime

Crime remains a severe threat to communities and nations across the globe, and increasingly sophisticated technology and processes are being exploited to enable highly complex criminal activities. Data mining, the process of uncovering hidden information from Big Data, is now an important tool for investigating, curbing and preventing crime, and is used by both private and government institutions around the world. The primary aim of this paper is to provide a concise review of data mining applications in crime. To this end, the paper reviews over 100 applications of data mining in crime, covering a substantial quantity of research to date. The applications are presented in chronological order, with an overview table of many important data mining applications in the crime domain serving as a reference directory. The data mining techniques themselves are briefly introduced to the reader; these include entity extraction, clustering, association rule mining, decision trees, support vector machines, the naive Bayes rule, neural networks and social network analysis, amongst others. © 2016 Wiley Periodicals, Inc. Statistical Analysis and Data Mining: The ASA Data Science Journal, 2016
