Assessing the State of the Art of Commercial Tools for Unstructured Information Exploitation

This paper provides a snapshot of the state of the art in information retrieval and information extraction from text, based on a selection of market-leading commercial tools. Drawing on a research project conducted for the Belgian Police, we give an overview of the main desired features that these tools provide or lack, along with their measured quality in operation. We also identify various shortcomings and formulate suggestions for improvement.
