Automatic Generation of Generalization Lemmas for Proving Properties of Tail-Recursive Definitions

Automatically proving properties of tail-recursive function definitions by induction is known to be challenging. The difficulty arises due to a property of a tail-recursive function definition typically expressed by instantiating the accumulator argument to be a constant only on one side of the property. The application of the induction hypothesis gets blocked in a proof attempt. Following an approach developed by Kapur and Subramaniam, a transformation heuristic is proposed which hypothesizes the other side of property to also have an occurrence of the same constant. Constraints on the transformation are identified which enable a generalization of the constant on both sides with the hope that the generalized conjecture is easier to prove. Conditions are generated from which intermediate lemmas necessary to make a proof attempt to succeed can be speculated. By considering structural properties of recursive definitions, it is possible to identify properties of the functions used in recursive definitions for the conjecture to be valid. The heuristic is demonstrated on well-known tail-recursive definitions on numbers as well as other recursive data structures, including finite lists, finite sequences, finite trees, where a definition is expressed using one recursive call or multiple recursive calls. In case, a given conjecture is not valid because of a possible bug in an implementation (a tail-recursive definition) or a specification (a recursive definition), the heuristic can be often used to generate a counter-example. Conditions under which the heuristic is applicable can be checked easily. The proposed heuristic is likely to be helpful for automatically generating loop invariants as well as in proofs of correctness of properties of programs with respect to their specifications.