Distances in Classification

The notion of distance is the most important basis for classification. This is especially true for unsupervised learning, i.e. clustering, where no validation by objects of known group membership is available. But even in supervised learning, standard distances often do not lead to appropriate results: for each individual problem, an adequate distance has to be chosen. This is demonstrated by means of three practical examples from very different application areas, namely social science, musicology, and production economics. In social science, clustering is applied to spatial regions with very irregular borders, so adequate spatial distances may have to be taken into account. In statistical musicology, the main problem is often to find a suitable transformation of the input time series as the basis for the distance definition; in addition, local modelling is proposed to account for different subpopulations, e.g. instruments. In production economics, many quality criteria with very different scalings often have to be taken into account; in order to find a compromise optimum classification, the criteria are pre-transformed onto a common scale, called desirability.
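The desirability pre-transformation mentioned above can be illustrated with a minimal sketch: each quality criterion is mapped onto (0, 1) by a desirability function (here Harrington's one-sided form is assumed), and the individual desirabilities are combined into an overall index via their geometric mean. The coefficients `b0`, `b1` and the sample criterion values are hypothetical, chosen only for illustration.

```python
import math

def harrington_one_sided(y, b0, b1):
    """Harrington's one-sided desirability function: maps a quality
    criterion y onto (0, 1), with values near 1 being highly desirable."""
    return math.exp(-math.exp(-(b0 + b1 * y)))

def desirability_index(desirabilities):
    """Overall desirability index: the geometric mean of the individual
    desirabilities, so a single criterion near 0 pulls the index down."""
    n = len(desirabilities)
    prod = 1.0
    for d in desirabilities:
        prod *= d
    return prod ** (1.0 / n)

# Two hypothetical quality criteria on very different scales,
# e.g. a fractional yield and a throughput in units per hour.
criteria = [0.92, 250.0]
# Hypothetical (b0, b1) coefficients calibrated per criterion.
params = [(0.0, 5.0), (-5.0, 0.025)]

ds = [harrington_one_sided(y, b0, b1) for y, (b0, b1) in zip(criteria, params)]
index = desirability_index(ds)
```

After this transformation, all criteria share the same scale, so a single compromise optimum can be sought on the desirability index instead of trading off incommensurable raw measurements.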
