Combinatorial optimisation and hierarchical classifications

Abstract. This paper is devoted to selected topics relating combinatorial optimization and hierarchical classification. It is oriented toward extensions of the standard classification scheme (hierarchies): pyramids, quasi-hierarchies, circular clustering, rigid clustering and others. Bijection theorems between these models and dissimilarity models allow some clustering problems to be stated as optimization problems. Within the broad field of optimization, we discuss in particular: NP-completeness results and the search for polynomial instances; problems solvable in polynomial time (e.g. subdominant theory); and the design, analysis and applications of algorithms. In contrast with this orientation toward “new” clustering problems, the last part discusses some standard algorithmic approaches.
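As an illustration of a polynomially solvable problem of the kind mentioned above, the following minimal sketch computes the subdominant ultrametric of a dissimilarity, that is, the largest ultrametric lying below it. It assumes the dissimilarity is given as a symmetric matrix with a zero diagonal and uses the well-known minimax-path characterisation (the value between two points is the smallest achievable largest step over all paths joining them). The function names and the sample matrix are hypothetical and are not taken from the paper.

```python
def subdominant_ultrametric(d):
    """Return the largest ultrametric u with u <= d (entrywise).

    u[i][j] is the minimum, over all paths from i to j, of the largest
    dissimilarity along the path. A Floyd-Warshall style closure computes
    it in O(n^3) time.
    """
    n = len(d)
    u = [row[:] for row in d]  # work on a copy; the input is left unchanged
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])  # cost of going through k
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def is_ultrametric(u):
    """Check the ultrametric inequality u(i,j) <= max(u(i,k), u(k,j))."""
    n = len(u)
    return all(
        u[i][j] <= max(u[i][k], u[k][j])
        for i in range(n)
        for j in range(n)
        for k in range(n)
    )


# Hypothetical dissimilarity on four objects (symmetric, zero diagonal).
d = [
    [0, 2, 5, 6],
    [2, 0, 3, 7],
    [5, 3, 0, 4],
    [6, 7, 4, 0],
]
u = subdominant_ultrametric(d)
for row in u:
    print(row)
```

Cutting the resulting ultrametric at each of its distinct values yields the nested partitions of the single linkage hierarchy, which is one concrete instance of the correspondence between hierarchies and dissimilarity models. The cubic closure is chosen here for brevity; a minimum spanning tree based computation would be faster on large inputs.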
