A Comparative Study to Evaluate Filtering Methods for Crime Data Feature Selection

Abstract: This paper presents a comparative study of correlation and information gain algorithms for evaluating and producing a subset of crime features. The main objective is to find a subset of attributes from a dataset described by a feature set and to classify crimes into three categories: low, medium and high. The experiment is carried out on the Communities and Crime dataset using WEKA, an open-source data mining tool. Based on the attributes chosen by five feature selection methods, the accuracy rates of several classification algorithms were obtained for analysis. The results demonstrate that the correlation method outperformed both information gain and the human expert, with a mean accuracy of 96.94% across all classifiers and FSs, using an optimal subset of 13 features. This feature subset carries important information for classification and can be applied effectively to crime datasets to predict the crime category for different states and to directly support decision making in crime prevention systems.
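To make the two filter criteria concrete, the following is a minimal sketch of how features might be ranked by correlation and by information gain. It is not the WEKA implementation used in the study. The feature names, the toy values, the median-based binarization, and the numeric encoding of the classes (0 = low, 1 = medium, 2 = high) are all assumptions made for illustration.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(vx * vy)


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """H(labels) - H(labels | values) for a discrete-valued feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def binarize(xs):
    """Assumed discretization: 1 if the value is at or above the upper median."""
    threshold = sorted(xs)[len(xs) // 2]
    return [int(x >= threshold) for x in xs]


def rank(features, labels, score):
    """Return feature names sorted from highest to lowest score."""
    return sorted(features, key=lambda name: score(features[name], labels), reverse=True)


# Hypothetical toy data: classes encoded as 0 = low, 1 = medium, 2 = high.
labels = [0, 0, 1, 1, 2, 2]
features = {
    "f_noise": [0.5, 0.1, 0.9, 0.2, 0.4, 0.6],
    "f_signal": [0.1, 0.2, 0.5, 0.6, 0.9, 1.0],
}

by_correlation = rank(features, labels, lambda xs, ys: abs(pearson(xs, ys)))
by_info_gain = rank(features, labels, lambda xs, ys: information_gain(binarize(xs), ys))
print("correlation ranking:", by_correlation)
print("information gain ranking:", by_info_gain)
```

In a filter approach of this kind, the top-ranked features would be retained and passed to the classifiers. Treating the ordered classes as numbers for the correlation score is a simplification that only makes sense because low, medium and high are ordinal.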
