A relatedness analysis of government regulations using domain knowledge and structural organization

The complexity and diversity of government regulations make understanding and retrieving them a non-trivial task. One issue is that regulations and interpretive guides come from multiple sources that differ in format, terminology, and context. This paper describes a comparative analysis scheme developed to help retrieve related provisions from different regulatory documents. Specifically, the goal is to identify the most strongly related provisions between regulations. The relatedness analysis uses a combination of feature matches in addition to traditional term matching, and structural analysis in addition to content comparison. Regulations are first compared on conceptual information and domain knowledge through feature matching. Regulations also possess specific organizational structures, such as a tree hierarchy of provisions and heavy referencing between provisions. These structures carry useful information for locating related provisions and are therefore exploited to make the comparison of regulations more complete. System performance is evaluated by comparing a similarity ranking produced by users with the machine-predicted ranking. The ranking produced by the relatedness analysis system shows a reduction in error compared to that of Latent Semantic Indexing. Various pairs of regulations are compared, and the results are analyzed along with observations on the use of different features. An example e-rulemaking scenario demonstrates the capabilities and limitations of the prototype relatedness analysis system.
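The two-stage idea described above, a content-based comparison followed by a refinement that uses the organizational structure, can be illustrated with a minimal Python sketch. It is not the system described in this paper: the bag-of-words cosine score, the averaging over neighbor pairs, the weight `alpha`, and all provision identifiers and texts are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relatedness(reg_a, reg_b, neighbors_a, neighbors_b, alpha=0.7):
    """Return (base, refined) score dictionaries keyed by (id_a, id_b).

    reg_a, reg_b  : provision id -> provision text
    neighbors_a/b : provision id -> ids of structurally linked provisions
                    (parent, children, or referenced provisions)
    alpha         : hypothetical weight on the content score (an assumption)
    """
    # Stage 1: content-only comparison of every provision pair.
    base = {(i, j): cosine(ta, tb)
            for i, ta in reg_a.items() for j, tb in reg_b.items()}

    # Stage 2: blend each pair's score with the mean base score of its
    # neighbor pairs, so related surroundings raise the score.
    refined = {}
    for (i, j), score in base.items():
        pairs = [(p, q)
                 for p in neighbors_a.get(i, [])
                 for q in neighbors_b.get(j, [])]
        if pairs:
            context = sum(base[pq] for pq in pairs) / len(pairs)
            refined[(i, j)] = alpha * score + (1 - alpha) * context
        else:
            refined[(i, j)] = score
    return base, refined


# Hypothetical provisions from two regulations (illustrative text only).
reg_a = {"A1": "accessible ramp slope", "A1.1": "ramp handrail height"}
reg_b = {"B1": "ramp slope requirements", "B1.1": "handrail height", "B2": "door width"}
neighbors_a = {"A1": ["A1.1"], "A1.1": ["A1"]}
neighbors_b = {"B1": ["B1.1"], "B1.1": ["B1"]}  # B2 has no neighbors

base, refined = relatedness(reg_a, reg_b, neighbors_a, neighbors_b)
best_match = max(reg_b, key=lambda j: refined[("A1", j)])
```

In this sketch, a pair with no structural neighbors keeps its content score unchanged, while a pair whose parent or child provisions also match receives a blended score. A real system would replace the term-frequency cosine with the feature matching and domain knowledge the abstract refers to.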
