The Generalized Torquist: Specification and Estimation of a New Vocabulary-Text Size Function

The aim of this study is to construct and test a new model of the relationship between vocabulary size (V) and text length (N), the V – N relationship. We propose and estimate a new V – N function, the generalized Torquist, in which the elasticity parameter varies with text length. As its name suggests, the generalized Torquist function includes the Torquist function as a special case, so that the special case can be tested against the general form using conventional statistical techniques. It thus sheds new light on the relationship between V and N. We apply the generalized Torquist function to eight ancient Greek texts; to our knowledge, this study is the first to estimate such a function on ancient Greek texts. In addition, the study makes use of the concept of elasticity, which can serve not only to explore how changes in N relate to changes in V but also as a powerful tool for analysing patterns of text similarity.