Dyadic Data Analysis

neighbors, classification and regression trees (six pages), and neural networks (17 pages). The mechanics of the latter are adequately described, but there is too much emphasis on arithmetic, and little effort is made to justify the prediction process intuitively. At the end of the chapter, there is one paragraph on multiple regression analysis and one sentence on logistic regression. A major disappointment to me was the almost exclusive reliance in the examples on a rather old automobile fuel efficiency dataset (there is one observation for a Datsun 1200 vehicle). I had hoped to see some real business applications.

After a four-page “Deployment” chapter, the book ends with a “Conclusions” chapter containing one large-scale example involving data on the incidence of diabetes among Pima Indians. Here we find histograms, box plots, a two-sample t test, some derived associative rules that I did not find overly insightful, and a brief summary of prediction results via neural networks. I think students coming out of an undergraduate regression course could do a fine job of analyzing these data without resorting to the bells and whistles of data mining. And once again, where are the business applications?

Although the text does give a brief snapshot of the subject, it is lacking in detail, applications, and opportunities for practice. Someone considering becoming involved in a data mining project or teaching an introductory course in the subject would be well advised to learn much more than what MSD offers. Good information sources are the much more ambitious books by Hastie, Tibshirani, and Friedman (2001) (the best-selling Springer statistics book ever, thanks to purchases by those outside our discipline) and Bishop (2006).