Rens Bod, Beyond grammar: an experience-based theory of language. Stanford, CA: CSLI Publications, 1998. Pp. xiii+168.

During the last  years, there has been a sea change in natural language processing (NLP), with the majority of the field turning to the use of machine learning methods, particularly probabilistic models learned from richly annotated training data, rather than relying on handcrafted grammar models. Until recently, this revolution has had little impact within linguistics proper (as noted, but lamented, by Abney ), but this is now beginning to change, giving Bod’s Beyond grammar particular relevance. This medium-length monograph is largely in the tradition of NLP research, but it is more interesting to linguists than most such work because it does spend time on foundational issues, arguing for an ‘experience-based’ model of language as an alternative to standard rule-based conceptions. Indeed, it is useful to distinguish two parts to Bod’s book. The larger, middle part is solidly NLP: statistical parsing, the formalization of various probabilistic grammar models and the evaluation of systems on parsing tasks. The beginning and end of the book address the big picture of how to approach human language processing. The book’s thesis is that the central issues in human language cannot be described via a competence grammar, but rather should be described via ‘a statistical ensemble of language experiences ’ () remembered by each language user – a corpus, if you will – which the language user draws on and productively recombines to understand and produce new sentences. The central issue of linguistics should not be Universal Grammar but defining a Universal Representation suitable for this corpus – these representational issues are particularly acute once one moves beyond syntactic phrase structure to issues of semantic and discourse representation. In such a conception, linguistic competence and performance are inseparably intertwined. At a big picture level, I think Bod is right in a number of respects. He is right to emphasize that people are very sensitive to frequency when processing and producing language, and that this was for a long time ignored. He is right to believe that linguistics should be more engaged with modern machine learning research, and that it was a mistake to think that limitations on long-term memory are a major concern in models of human cognition. And I suspect that he is right in his scepticism toward traditional views of a strongly innate notion of knowledge of language. The distinctive feature of Bod’s Data Oriented Parsing (DOP) approach is to model sentence probabilities in terms of the previously observed frequencies of sentence fragments, including large fragments, whereas most other approaches work using just information in immediate local trees, with limited means of information percolation, in particular, use of head percolation in a manner familiar from theories such as Generalized Phrase Structure Grammar. The use of large and varied fragments to predict the probabilities of trees allows Bod to give a good account of processing idiom chunks of various sorts, and to explain non-head dependencies, such as between a superlative and a following PP – the fastest woman in the world – although whether there is a non-head dependency here depends on details of the assumed linguistic analysis. This is not the appropriate venue for a detailed discussion of the technical NLP part of the book. 
To summarize very briefly, Bod’s DOP model is one of a number of approaches to statistical parsing developed during the 1990s (see Manning & Schütze (1999) for general background and discussion of other approaches). Most of the book discusses DOP, or Stochastic Tree-Substitution Grammars, which use conventional phrase structure tree representations. The emphasis on syntactic phrase structure trees is implausible from a psycholinguistic perspective, where most evidence suggests that humans rapidly forget words and syntactic structures and remember only meanings, but Bod justifies it on the practical ground that the large structured corpora available provide only phrase structure trees. Later chapters discuss