Data Mining for Cross-Sectional Data

This chapter investigates methods for analyzing cross-sectional data, i.e., data represented in matrix form in which each row represents one process instance. The analysis methods fall into two groups: supervised learning, also known as predictive analysis, and unsupervised learning. Under predictive analysis, we cover techniques for regression and classification; for unsupervised learning, we present methods for cluster analysis. Section 5.1 introduces supervised learning, and Sects. 5.2 and 5.3 present a number of techniques for regression and classification. Section 5.4 treats the principles of unsupervised learning and techniques for cluster analysis.
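To make the matrix representation and the two groups of methods concrete, the following minimal sketch may help. It is an illustration under stated assumptions, not a method taken from the chapter: the data values, the interpretation of the columns as attributes, and the helper names `fit_linear_regression`, `predict`, and `two_means` are all hypothetical. Ordinary least squares stands in for regression and a bare-bones two-cluster k-means stands in for cluster analysis.

```python
import numpy as np

# Hypothetical cross-sectional data: each row is one process instance,
# each column one numeric attribute (all values are made up for illustration).
X = np.array([
    [2.0, 1.0],
    [3.0, 1.0],
    [4.0, 2.0],
    [10.0, 5.0],
    [11.0, 6.0],
    [12.0, 6.0],
])

# Hypothetical target for supervised learning, one value per instance.
# It is constructed as y = 1 + 2*x1 + 1*x2 so the fit can be checked by eye.
y = np.array([6.0, 8.0, 11.0, 26.0, 29.0, 31.0])


def fit_linear_regression(X, y):
    """Supervised: ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend a column of ones
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # [intercept, weight of column 1, weight of column 2]


def predict(coef, X):
    """Apply the fitted linear model to rows of X."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def two_means(X, n_iter=10):
    """Unsupervised: minimal k-means with k = 2.

    Centers are initialized from the first and last row to keep the sketch
    deterministic. No handling of empty clusters is included.
    """
    centers = X[[0, -1]].astype(float)
    for _ in range(n_iter):
        # Euclidean distance from every row to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])
    return labels


coef = fit_linear_regression(X, y)   # uses the target y
labels = two_means(X)                # uses only the attribute matrix X
print("regression coefficients:", coef)
print("cluster labels:", labels)
```

The regression step requires the target vector in addition to the data matrix, whereas the clustering step works on the matrix alone; this is the practical difference between the supervised and unsupervised settings described above.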
