Risk analysis for intellectual property litigation

We introduce the problem of risk analysis for Intellectual Property (IP) lawsuits. Specifically, we focus on estimating the risk for participating parties using solely prior factors, i.e., the historical and concurrent behavior of the entities involved in the case. This work represents a first step towards building a comprehensive legal risk assessment system for parties involved in litigation. Such a system would allow parties to optimize their case parameters to minimize their own risk, or to settle disputes out of court and thereby ease the burden on the judicial system. It could also help U.S. courts detect and correct inherent biases in the system. We model risk estimation as a relational classification problem, using conditional random fields [6] to jointly estimate the risks of concurrent cases. We evaluate our model on data collected by the Stanford Intellectual Property Litigation Clearinghouse, which consists of over 4,200 IP lawsuits filed across 88 U.S. federal districts over 8 years, probably the largest legal data set reported in data mining research. Despite being agnostic to the merits of the case, our best model achieves a classification accuracy of 64%, a 22% relative improvement over the majority-class baseline.
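The abstract does not state the majority-class baseline accuracy itself, but it can be backed out from the two reported figures. The following is a minimal sketch, assuming the 22% figure is computed as (accuracy - baseline) / baseline; the function names are illustrative and do not come from the paper.

```python
def implied_baseline(accuracy: float, relative_gain: float) -> float:
    """Solve accuracy = baseline * (1 + relative_gain) for baseline."""
    return accuracy / (1.0 + relative_gain)


def relative_gain(accuracy: float, baseline: float) -> float:
    """Relative improvement of accuracy over baseline (assumed definition)."""
    return (accuracy - baseline) / baseline


# Figures reported in the abstract: 64% accuracy, 22% relative improvement.
baseline = implied_baseline(0.64, 0.22)
print(f"implied majority-class baseline: {baseline:.3f}")
```

Under that assumption the implied baseline is roughly 52%, which would mean the majority outcome accounts for only slightly more than half of the cases.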

[1] Nando de Freitas, et al. An Introduction to MCMC for Machine Learning, 2004, Machine Learning.

[2] Alexander M. Millkey. The Black Swan: The Impact of the Highly Improbable, 2009.

[3] Max Welling, et al. Learning in Markov Random Fields: An Empirical Study, 2005.

[4] Henry A. Kautz, et al. Extracting Places and Activities from GPS Traces Using Hierarchical Conditional Random Fields, 2007, Int. J. Robotics Res.

[5] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[6] Onur Behzat Tokdemir, et al. Predicting the Outcome of Construction Litigation Using Neural Networks, 1998.

[7] Dan Klein, et al. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network, 2003, NAACL.

[8] B. Triggs, et al. Scene Segmentation with Conditional Random Fields Learned from Partially Labeled Images, 2007, NIPS.

[9] Kwok-Wing Chau, et al. Predicting Construction Litigation Outcome Using Particle Swarm Optimization, 2005, IEA/AIE.

[10] R. Fisher. On the Interpretation of χ² from Contingency Tables, and the Calculation of P, 1922.

[11] Kevin P. Murphy, et al. Figure-Ground Segmentation Using a Hierarchical Conditional Random Field, 2007, Fourth Canadian Conference on Computer and Robot Vision (CRV '07).

[12] Christopher D. Manning, et al. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling, 2005, ACL.

[13] Trevor Darrell, et al. Conditional Random Fields for Object Recognition, 2004, NIPS.

[14] J. Besag. Statistical Analysis of Non-Lattice Data, 1975.

[15] Andrew Stranieri, et al. Knowledge Discovery from Legal Databases, 2005.

[16] Andrew McCallum, et al. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data, 2004, J. Mach. Learn. Res.