Predicting litigation likelihood and time to litigation for patents

The ability to forecast the likelihood of patent litigation and the time to litigation benefits companies in many ways, including patent portfolio management and strategic planning. We therefore develop predictive models that estimate the likelihood that a patent will be litigated and the expected time to litigation. Our work aims to improve on the state of the art by relying on a different set of features and by applying more sophisticated algorithms to realistic data. Specifically, the models incorporate factors that may influence whether a patent is litigated. These features, collected at the patent's issue date and thus prior to any litigation, include textual features, general information about the patent, and financial information about the patent's assignee. Our proposed models combine a clustering approach with an ensemble classification method. Despite a very low litigation rate of 1 to 2 percent, the models show promising predictive performance. Financial information and features related to referencing are important indicators for distinguishing between litigated and non-litigated patents.