Algorithmic Inference in Machine Learning

This book can be succinctly characterized as a coherent and comprehensive treatise on the fundamentals, algorithms, and practice of machine learning. More specifically, the authors focus on various inference paradigms developed within the framework of computational learning theory. The content, as stated by the authors, encompasses ideas, experience, and experiments developed throughout the life of the Laboratory for Neural Networks, University of Milan, Milan, Italy.

The volume is structured into five chapters. The first, titled “Knowledge versus randomness,” addresses the important and practically relevant question of the relationship between these two concepts and demonstrates how probabilistic models help capture and quantify randomness. Most of this chapter serves as a solid prerequisite for the rest of the book by covering the fundamentals of the calculus of random variables. The next chapter focuses on the mechanisms of algorithmic inference in the statistical setting; here the reader finds sufficient statistics, confidence intervals, adequacy of sample size, point estimators, and issues of entropy. The first two chapters naturally form the introductory part of the book, dealing as they do with the foundations of the subject. The second part is about the applications of machine learning and consists of three comprehensive chapters: on computational learning theory, regression theory, and subsymbolic learning. This combination very much reflects the main directions of the key developments in machine learning. Chapter 3 is devoted to computational learning and covers its core concepts. Learning Boolean functions under the well-known probably approximately correct (PAC) principle occupies a prominent position in the theory, and the same is true in this chapter. Considerable attention is paid to computing hypotheses, confidence intervals for learning errors, high-level abstraction, and learnability.
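For readers unfamiliar with the framework, the flavor of the PAC results discussed in Chapter 3 can be conveyed by the standard sample-complexity bound for a finite hypothesis class (a textbook result, stated here as background rather than a formula quoted from the book under review):

```latex
% For a finite hypothesis class H and a learner returning a hypothesis
% consistent with the training sample, the PAC guarantee holds once the
% sample size m satisfies
m \;\ge\; \frac{1}{\epsilon}\left(\ln |H| + \ln \frac{1}{\delta}\right),
% in which case, with probability at least 1 - \delta over the draw of
% the sample, the learned hypothesis has true error at most \epsilon.
```

Bounds of this kind, together with the confidence intervals for learning errors mentioned above, are the currency of the computational learning theory the chapter surveys.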
The chapter on regression theory (Chapter 4) is quite extensive and covers linear and nonlinear regression, PAC learning of regression functions, point estimators, and confidence intervals. Looking at neural networks (a subject covered in a number of sections of the book), it is worth stressing that this material is nicely cast in the framework of general learning theory. I would say this is a quite inspiring environment that is both useful and appealing to newcomers to the area of neurocomputing as well as to those well versed in the learning algorithms of neural networks. Neural networks dominate the chapter on subsymbolic learning (Chapter 5). Here the authors cover a generic taxonomy of such networks, elaborate on various learning strategies (cast in the setting of machine learning), and tackle learning without supervision (self-associative memories and self-organizing maps). Among the topics offering another look at neural networks are networks and data compression (the compression relates to the idea of Kolmogorov complexity), sufficient statistics, and the stochastic facets of learning mechanisms. A comprehensive, well-organized, and carefully compiled bibliography is an asset of the book; one finds here entries for classic texts in statistics, probability, learning theory, fuzzy sets, granular computing, and neural networks. The book is equipped with sidebars highlighting