Paper Information - Network Clustering

Network Clustering

Clustering can be loosely defined as the process of grouping objects into sets called clusters, so that each cluster consists of elements that are similar in some way. The similarity criterion can be defined in several different ways, depending on applications of interest and the objectives that the clustering aims to achieve. For example, in distance-based clustering (see Figure 1) two or more elements belong to the same cluster if they are close with respect to a given distance metric. On the other hand, in conceptual clustering, which can be traced back to Aristotle and his work on classifying plants and animals, the similarity of elements is based on descriptive concepts. Clustering is used for multiple purposes, including finding “natural” clusters (modules) and describing their properties, classifying the data, and detecting unusual data objects (outliers). In addition, treating a cluster or one of its elements as a single representative unit allows us to achieve data reduction. Network clustering, which is the subject of this chapter, deals with clustering the data represented as a network, or a graph. Indeed, many data types can be conveniently modeled using graphs. This process is sometimes called link analysis. Data points are represented by vertices and an edge exists if two data points are similar or related in a certain way. It is important
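The graph model described above, with data points as vertices and an edge between any two points that are similar, can be illustrated with a minimal sketch. The function names, the sample points, the Euclidean metric, and the similarity threshold of 1.0 are all assumptions made for illustration. Treating connected components as clusters is only one simple choice and is not necessarily the approach taken in the chapter.

```python
import math
from collections import deque


def build_threshold_graph(points, threshold):
    """Return adjacency sets: vertices are point indices, and an edge joins
    two points whose Euclidean distance is at most `threshold` (assumed rule)."""
    n = len(points)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def connected_components(adjacency):
    """Group vertices into connected components using breadth-first search."""
    seen = set()
    clusters = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        clusters.append(sorted(component))
    return clusters


# Hypothetical sample data: two tight groups and one isolated point.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.6), (5.0, 5.0), (5.4, 5.1), (10.0, 0.0)]
graph = build_threshold_graph(points, threshold=1.0)
clusters = connected_components(graph)
print(clusters)
```

In this sketch the isolated point ends up in a cluster of its own, which loosely mirrors the outlier-detection use of clustering mentioned above. A stricter cluster definition, such as requiring every pair of vertices in a cluster to be adjacent, would generally give different groupings on the same graph.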

Sergiy Butenko | Balabhaskar Balasundaram
