Complex Problem Solving: Identity Matching Based on Social Contextual Information

Complex problems such as drug crime often involve a large number of interacting variables. A complex problem may be solved by breaking it into parts (i.e., sub-problems) that can be tackled more easily. The identity matching problem, for example, is one part of the larger problem of drug-related and other types of crime. It is often encountered during criminal investigations, when a single criminal is represented by multiple identity records in law enforcement databases. Because of discrepancies among these records, a single criminal may appear to be several different people. Following Enid Mumford’s three-stage problem-solving framework, we design a new method to address the problem of criminal identity matching in fighting drug-related crime. Traditionally, the complexity of criminal identity matching has been reduced by treating criminals as isolated individuals who maintain certain personal identities. In this research, we recognize the intrinsic complexity of the problem and treat criminals as interrelated rather than isolated individuals. In other words, we take the social relationships between criminals into consideration during the matching process, studying not only the personal identities but also the social identities of criminals. The evaluation results were encouraging: combining social features with personal features improved the performance of criminal identity matching. In particular, social features become more useful when the data contain many missing values for personal attributes.
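As a rough illustration of the idea described above, the following Python sketch scores a pair of identity records using both personal attributes and a social feature. It is not the authors' method. The field names (`name`, `dob`, `address`, `associates`), the use of edit distance for personal attributes, the Jaccard overlap of known associates as the social feature, and the plain average used to combine them are all assumptions made for illustration. Missing attributes are skipped rather than penalized, which shows how the social feature can carry more weight when personal attributes are absent.

```python
from typing import Optional, Set


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, row by row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def string_similarity(a: Optional[str], b: Optional[str]) -> Optional[float]:
    """Similarity in [0, 1]; None when either value is missing."""
    if not a or not b:
        return None
    longest = max(len(a), len(b))
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest


def jaccard(a: Set[str], b: Set[str]) -> Optional[float]:
    """Overlap of two associate sets; None when both are empty."""
    if not a and not b:
        return None
    return len(a & b) / len(a | b)


def match_score(rec_a: dict, rec_b: dict,
                personal_fields=("name", "dob", "address")) -> float:
    """Average of the available personal and social similarities.

    Hypothetical combination rule: a learned classifier could replace
    this simple mean.
    """
    scores = [string_similarity(rec_a.get(f), rec_b.get(f))
              for f in personal_fields]
    scores.append(jaccard(rec_a.get("associates", set()),
                          rec_b.get("associates", set())))
    available = [s for s in scores if s is not None]
    return sum(available) / len(available) if available else 0.0


if __name__ == "__main__":
    # Made-up records: the first has no date of birth and neither has an address.
    rec_a = {"name": "John Smith", "dob": None,
             "associates": {"p1", "p2", "p3"}}
    rec_b = {"name": "Jon Smith", "dob": "1980-01-01",
             "associates": {"p2", "p3", "p4"}}
    print(round(match_score(rec_a, rec_b), 3))
```

In this sketch the two records share only one usable personal attribute, so the overlap in associates contributes half of the final score. In practice the per-feature similarities would more likely be fed to a trained classifier than averaged.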
