Text and Citations Based Cluster Analysis of Legal Judgments

Extracting relevant information efficiently from a collection of legal judgments remains an open research problem. Legal judgments contain citations in addition to text. In the web domain, link information has been exploited to build efficient search systems; similarly, the citation information in legal judgments could be used for efficient search. In this paper, we propose an approach that finds similar judgments by exploiting their citations through cluster analysis. Because several judgments have few citations, we employ the notion of a paragraph link to increase the number of citations in a judgment. A user evaluation study on a dataset of judgments of the Supreme Court of India shows that the proposed clustering approach is able to find similar judgments by exploiting citations and paragraph links. Overall, the results show that citation information in judgments can be exploited to establish similarity between judgments.
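To make the idea of citation-based similarity concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes each judgment is represented only by the set of judgments it cites, measures similarity as the cosine between those sets, and groups judgments whose similarity reaches a threshold. The function names, the threshold value, and the judgment and citation identifiers are all hypothetical. The threshold-based grouping stands in for a real clustering algorithm, and paragraph links are not modeled.

```python
from math import sqrt


def citation_cosine(a: set[str], b: set[str]) -> float:
    """Cosine similarity between two citation sets (binary citation vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def cluster_by_citations(
    citations: dict[str, set[str]], threshold: float = 0.4
) -> list[set[str]]:
    """Group judgments into connected components of the graph that links
    two judgments when their citation similarity is >= threshold.

    This is a simple stand-in (assumption), not the paper's clustering step.
    """
    ids = sorted(citations)
    parent = {j: j for j in ids}

    def find(j: str) -> str:
        # Union-find lookup with path halving.
        while parent[j] != j:
            parent[j] = parent[parent[j]]
            j = parent[j]
        return j

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if citation_cosine(citations[a], citations[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for j in ids:
        clusters.setdefault(find(j), set()).add(j)
    # Order clusters by their smallest judgment id for stable output.
    return sorted(clusters.values(), key=min)


# Hypothetical toy data: judgment id -> ids of the judgments it cites.
example = {
    "J1": {"A", "B", "C"},
    "J2": {"B", "C", "D"},
    "J3": {"X", "Y"},
    "J4": {"Y", "Z"},
    "J5": set(),  # a judgment with no citations stays in its own group
}

if __name__ == "__main__":
    for cluster in cluster_by_citations(example):
        print(sorted(cluster))
```

In this toy data, a judgment with no citations, such as J5, cannot be related to any other judgment. This illustrates the sparsity problem that the abstract says paragraph links are meant to reduce.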
