Decision Trees for Business Intelligence and Data Mining: Using SAS® Enterprise Miner™

This book presents a series of intellectual discussions concerning some of the most basic topics in statistics, including the interpretation of probability and statistical models of induction, the concepts of inference and p values, and the comparison of evidential theories. The book is a well-researched and thoroughly professional approach to the various dilemmas faced by statisticians in interpreting data. The author’s knowledge of the literature as it pertains to the various issues discussed in the book is excellent and reminds me of much of the literature I read and studied during my lengthy and varied education in statistics and the practice of statistical science. Furthermore, he provides for teachers of statistics a lengthy list of quotes from many great statisticians on subjects of interest to students of statistics, including Deming, Shewhart, and others, all the way to de Finetti and other Bayesians. The wealth of knowledge present in this 152-page book is remarkable; the book is a worthy addition to any library of statistical literature. Important questions as to why this book was written, what its purpose is, and how a practicing statistician uses the knowledge therein (e.g., “what is the purpose of the nature of statistical evidence?”) are left to the reader to answer for himself or herself. The underlying theme of the book is never stated by the author in a simple, clear-cut manner. Does one use the knowledge of the text to become a better statistician or a better teacher of statistics, or simply to understand the logic of statistical science in a way that is meaningful to all? As I read the book intensively, I asked myself whether this book is good for teaching others or is simply an intellectual exercise for some. What does the author have in mind when he discusses such common and useful concepts as p values or true value? Does he contribute to the understanding of these concepts, or does he make them more difficult to comprehend?
Furthermore, he translates at least one German word, iterationen, incorrectly. The German ending -en indicates that the word is plural, and it is translated as iterations or runs. This simple error makes one think that there are many other places in the text, especially in some of the mathematics, that may also be in error. Early in the book, the author attempts to determine some definition of statistical evidence (last paragraph on p. 3). But after reading the paragraph and the extensive discussion of the entire text, one cannot be sure what one has read. When teaching statistics at the college level, I try to keep the ideas as simple as possible without relying on an overwhelming exercise in sophistication as the goal. The author seems to define concepts, redefine concepts, and then define the same concepts yet again, with the end result being imperfect and probably misunderstood. Many of the ideas expressed and discussed in the book appear in a manner that is exasperating to comprehend. As an analogy, reading this book is like entering a room to learn something new but in the end leaving the room with no more knowledge and understanding than when one entered. There is simply too much confusion resulting from lengthy discussions of relatively simple concepts. Because the book is part of a series titled “Lecture Notes in Statistics,” I can only deduce that it is nothing more than a series of lecture notes. There is no flow to the book and little thought given to presenting the details in a style that would induce one to want to read it. When I was young, I would hear that many statisticians were very wise and intelligent persons who should be listened to and understood and who presented their knowledge in a direct and understandable way. I fear that this book is full of confusing discussions of very important issues. I would like to see this material rewritten in such a manner that we as readers would not have to work so hard to understand the aims of the author.
Although the author presents and quotes the works of many great thinkers, he does not bring out their points in a style that is easy for the reader to follow. Often, he places difficult mathematical theorems and proofs in the midst of lengthy paragraphs that do not need such mathematical sophistication. When I started reading this book, I expected a thoughtful text introducing the reader to the origins of statistics as a science and methodology. The book did not do this, nor did it present the numerous wonderful applications of statistics to the collection, analysis, and interpretation of data that illuminate our knowledge of the world. Where are the examples in quality control, experimentation, and the like that the author alludes to, and how does statistics provide the foundation for us to understand our world? When I finished the book, I felt that the author had failed to communicate to the reader why statistics is so important; instead, he clarifies less than he confuses.