Review of 'Learning Machines' (Nilsson, Nils J.; 1965)

Under the all-embracing title of Learning Machines, this book is an easy-to-read, well-presented survey of one approach to one aspect of a subclass of learning machines. The subclass of learning machines treated in this book is that “which can be trained to recognize patterns.” This type of machine is fed sets of “patterns” from several pattern classes and is trained to recognize the class membership of new patterns by use of stored information extracted from the data through training. The approach to pattern recognition used in the book involves the use of discriminant functions (primarily linear discriminants). These partition the space of measured pattern attributes into regions by means of hyperplanes of adjustable locations and orientations. The aspect of pattern recognition treated by the book is limited to the processing of measurements for training and recognition. It is not concerned with the measurement selection problem. After an introduction to the subject, some important discriminant functions are discussed in Chapter 2. Linear discriminants, quadratic discriminants (or minimum distance classifiers), and certain types of polynomial discriminants are covered. The foundations of the pertinent aspects of decision theory are outlined in the third chapter. Since in decision theory the probability densities describing the distributions of the sample patterns are assumed to be known, except for a few unknown parameters which must be estimated from available data, the author discusses decision theoretic techniques under the heading, “Parametric Training Methods.” The Gaussian assumption about the functional form of the prevailing probability densities and the estimation of the means of the Gaussian process from samples play a dominant role in this chapter. Chapters 4 and 5 are very important, for they give construction procedures and proofs for partitioning the space with threshold logic units (hyperplanes).
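For readers unfamiliar with the kind of construction procedure referred to here, the following is a minimal sketch of fixed-increment error-correction training for a single threshold logic unit. It is an illustration under stated assumptions, not code or notation taken from the book: the function names `train_tlu` and `classify`, the +1/-1 label convention, and the cap on training passes are all hypothetical choices made for this example.

```python
def train_tlu(samples, labels, epochs=100):
    """Fixed-increment error-correction training for one threshold logic unit.

    samples: list of equal-length feature tuples (assumed input format)
    labels:  list of +1 / -1 class labels (assumed convention)
    Returns the weight vector; its last entry is the threshold (bias) weight.
    """
    n = len(samples[0])
    w = [0.0] * (n + 1)                      # start from the zero weight vector
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            xa = list(x) + [1.0]             # augment the pattern with a constant 1
            s = sum(wi * xi for wi, xi in zip(w, xa))
            if y * s <= 0:                   # pattern is on the wrong side (or on the plane)
                # move the hyperplane toward classifying this pattern correctly
                w = [wi + y * xi for wi, xi in zip(w, xa)]
                errors += 1
        if errors == 0:                      # a full pass with no corrections: stop
            break
    return w


def classify(w, x):
    """Return +1 or -1 according to the side of the hyperplane on which x lies."""
    s = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if s > 0 else -1


if __name__ == "__main__":
    # Toy, linearly separable data chosen only for illustration.
    patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
    classes = [-1, -1, -1, 1]
    weights = train_tlu(patterns, classes)
    print([classify(weights, p) for p in patterns])
```

The sketch stops as soon as one full pass over the samples needs no correction, which for linearly separable samples happens after a finite number of corrections. If the samples are not linearly separable, the loop simply runs out of passes and returns whatever weights it has.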
A brief discussion of cascaded (or layered) machines is given in Chapter 6, and methods for seeking out the modes of the probability distributions are mentioned in the final chapter. While no examples are drawn from the real world, the frequent use of a graphic, geometrical portrayal of the ways in which the vector space is partitioned by the different methods discussed does much to illustrate the properties (and the limitations) of the techniques presented. While the above and other limitations of the scope of coverage and depth of treatment are mentioned by the author at appropriate places, a student of learning machines (or of pattern recognition) who is not familiar with the literature and who attempts to acquaint himself with the field through this book would be left with an incomplete picture of the state of the art. Learning Machines is conspicuous by its omission of subject matter for such an embracing title. Subjects not discussed include learning on unlabeled inputs (without a teacher), learning to discriminate among a set of hypotheses that are not mutually exclusive, the dependence of adaptive techniques on the order in which inputs are introduced, a discussion of the relationship between work in artificial intelligence and pattern recognition, etc. This reviewer sympathizes with Nilsson and admits that the present state of the art does not permit one to say a great deal about the above topics. Failure to mention them, even if only to point out their existence, however, tends to make the reader believe that the main concern of the field of “learning machines” is to determine how to place K - 1 hyperplanes in an N-dimensional vector space to separate K finite sets of labeled vectors into their respective categories. For this reason, as a textbook or reference book on learning machines, this book fails to cover the subject adequately by virtue of its intentionally narrow scope.
As a technical treatise or monograph on the application of (linear) discriminants to the problems of learning to recognize patterns from a finite number of samples, the book lacks depth and presents little that has not already been published. The problems that are treated and the solutions and ideas that are offered, however, are presented clearly, simply, and with an honest statement of their limitations. This is a refreshing change from much of the published literature of this still-obscure field.