From Linking Text to Linking Crimes: Information Retrieval, But Not As You Know It

Information retrieval techniques have long been used to identify links between textual items for the automatic construction of hypertexts and electronic books, in which sought information can be accessed by browsing. While research in this area has steadily declined in recent years, some of the techniques developed in that context are proving very valuable in a number of new application areas. In this paper we present an approach to the automatic linking of textual items that is used to prioritise criminal suspects in a police investigation. A free-text description of an unsolved crime is compared with descriptions of previous offences for which the offender is known. By linking the descriptions, inferences about likely suspects can be made. Language modelling is adapted to produce a Bayesian model that assigns a probability to each suspect. An empirical study showed that linking free-text descriptions of burglaries enables the prioritisation of offenders. The model presented in this paper could easily be extended to take account of additional crime- and suspect-linking data, such as the geographical location of crimes or suspects' social networks. This would enable large networks of investigative information, automatically constructed from police archives, to be browsed.
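
To make the idea of language-model-based suspect prioritisation concrete, the following is a minimal sketch and not the model evaluated in the paper. It assumes a unigram language model per known offender, linear interpolation with a collection model (weight `lam`, chosen arbitrarily), and a prior proportional to each offender's number of recorded offences. The function names, the smoothing scheme, and the toy offence descriptions are all illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def rank_suspects(unsolved, solved, lam=0.5):
    """Rank known offenders for an unsolved crime description.

    solved: list of (offender, description) pairs for solved offences.
    lam:    interpolation weight between the offender model and the
            collection model (an assumed value, not from the paper).
    Returns a list of (offender, posterior probability), best first.
    """
    per_offender = {}        # offender -> word counts over their offences
    collection = Counter()   # word counts over all solved offences
    crimes = Counter()       # offender -> number of solved offences
    for offender, description in solved:
        words = tokenize(description)
        per_offender.setdefault(offender, Counter()).update(words)
        collection.update(words)
        crimes[offender] += 1

    total_words = sum(collection.values())
    total_crimes = sum(crimes.values())
    vocab = len(collection) + 1  # +1 reserves mass for unseen words

    query = tokenize(unsolved)
    log_scores = {}
    for offender, counts in per_offender.items():
        n = sum(counts.values())
        # Assumed prior: share of solved offences attributed to this offender.
        log_p = math.log(crimes[offender] / total_crimes)
        for w in query:
            p_off = counts[w] / n if n else 0.0
            # Add-one smoothed collection model keeps every term non-zero.
            p_col = (collection[w] + 1) / (total_words + vocab)
            log_p += math.log(lam * p_off + (1 - lam) * p_col)
        log_scores[offender] = log_p

    # Bayes' rule: normalise prior * likelihood over all offenders,
    # subtracting the maximum log score for numerical stability.
    top = max(log_scores.values())
    weights = {o: math.exp(s - top) for o, s in log_scores.items()}
    z = sum(weights.values())
    return sorted(((o, w / z) for o, w in weights.items()),
                  key=lambda pair: -pair[1])


# Toy data, invented purely for illustration.
solved = [
    ("A", "entered via rear window smashed glass"),
    ("A", "rear window forced jewellery taken"),
    ("B", "front door kicked in television taken"),
]
unsolved = "rear window smashed jewellery taken"
ranking = rank_suspects(unsolved, solved)
```

Here `ranking` lists the offenders in decreasing order of posterior probability, and the probabilities sum to one across the known offenders. Additional evidence of the kind mentioned above, such as geographical location, could in principle enter such a sketch as further factors in the prior or likelihood.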
