Strahler based graph clustering using convolution

We propose a method for visualizing large graphs. The approach computes a density function from a metric defined on the vertices of a graph. The density function is then filtered by convolution, and the filtered function yields a partition of the graph. Choosing an appropriate convolution kernel makes it possible to control the number of clusters and their size. The algorithm can run automatically, but the user can also set its parameters interactively. We applied the algorithm to two problems: legacy code extraction from the inclusion relation of C++ source files, and film sequence analysis. The metric used here is defined from Strahler numbers, which measure the "ramification" level of graph vertices.
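The pipeline described above (vertex metric, density function, convolution, partition) can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the authors' implementation. It makes several assumptions: the Strahler number is computed with the classic Horton-Strahler rule on a rooted tree only, the density function is a plain histogram of integer metric values, and clusters are separated at local minima of the smoothed histogram. All function names (`strahler`, `density`, `convolve`, `cut_points`, `cluster_by_metric`) are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def strahler(tree, root):
    """Classic Horton-Strahler number of every vertex of a rooted tree.

    `tree` maps a vertex to the list of its children (assumed input format).
    A leaf gets 1; an inner vertex gets max+1 if at least two children
    reach the maximum child value, and max otherwise.
    """
    values = {}
    stack = [(root, False)]  # iterative post-order traversal
    while stack:
        vertex, children_done = stack.pop()
        kids = tree.get(vertex, [])
        if not children_done:
            stack.append((vertex, True))
            stack.extend((child, False) for child in kids)
        elif not kids:
            values[vertex] = 1
        else:
            ranks = [values[child] for child in kids]
            top = max(ranks)
            values[vertex] = top + 1 if ranks.count(top) >= 2 else top
    return values


def density(values, bins):
    """Histogram of a non-negative integer metric: vertex count per value."""
    counts = Counter(values.values())
    return [counts.get(i, 0) for i in range(bins)]


def convolve(signal, kernel):
    """Discrete convolution with zero padding; output has len(signal) items.

    The kernel is assumed to have odd length so that it can be centred.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def cut_points(smooth):
    """Indices of local minima of the filtered density (assumed cut rule)."""
    return [i for i in range(1, len(smooth) - 1)
            if smooth[i - 1] > smooth[i] <= smooth[i + 1]]


def cluster_by_metric(values, kernel):
    """Assign each vertex a cluster id from the interval its metric falls in."""
    bins = max(values.values()) + 1
    cuts = cut_points(convolve(density(values, bins), kernel))
    clusters = {v: bisect_left(cuts, m) for v, m in values.items()}
    return clusters, cuts


if __name__ == "__main__":
    # Toy tree: r -> a, b ; a -> a1, a2 ; b -> b1
    tree = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
    print(strahler(tree, "r"))

    # Hand-made metric values with two well-separated groups.
    metric = {"p": 1, "q": 1, "s": 1, "t": 2, "u": 6, "v": 6, "w": 7, "x": 7}
    print(cluster_by_metric(metric, [0.25, 0.5, 0.25]))
```

In this sketch the kernel plays the role the abstract assigns to it: a wider kernel smooths away shallow minima and so tends to produce fewer, larger clusters, while a narrow kernel keeps more cut points.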
