Natural language processing and query expansion in legal information retrieval: Challenges and a response

As methods in legal information retrieval (IR) evolve to meet the demands of rapidly growing stores of electronic information, capturing detail in legal queries with natural language processing (NLP) has intuitive appeal. One difficulty with this approach is that incorporating word dependencies in IR has not been shown to improve results consistently and reliably over a unigram bag-of-words approach. We consider the challenges faced when incorporating NLP in IR and briefly review three proposals in this vein, highlighting how they might have responded better to the requirements of legal search. We then present our novel response, based on split query expansion, which accounts for the way lawyers seek to apply search results whilst meeting the identified challenges in a unique and flexible manner.
