Ranking and sparsifying edges of a graph

Many problems of practical interest can be represented by graphs. In practice, each edge of a graph is associated with a scalar weight denoting the similarity or relevance between two vertices. However, these edge weights might not directly reflect the relative "importance" of edges in maintaining the structure of the graph. The study of edge ranking is to order the edges of a graph by their relative importance under various criteria, and it is therefore essential for our fundamental understanding of graphs. In this thesis, we study various methods to determine the relative importance of edges under several graph models. We then use the edge ranking to examine two interrelated problems: graph sparsification and graph partitioning. First, we study the edge ranking problem in the usual graph. We use PageRank vectors to define the edge ranking and give an improved algorithm for computing approximate (personalized) PageRank vectors on a graph of n vertices, with tight error bounds which can be as small as O(1/n^p) for any fixed positive integer p. The improved PageRank algorithm is crucial for computing the quantitative ranking of edges in a given graph. Our graph sparsification algorithm samples the edges of a given graph with probabilities proportional to the edge ranking defined by PageRank, and it can be used as a preprocessing step for graph partitioning. Combining the graph sparsification and partitioning algorithms using PageRank vectors leads to an improved partitioning algorithm.
Next, we consider edge ranking and sparsification in connection graphs. Connection graphs arise in dealing with high-dimensional data sets, in which each edge is associated with both a scalar edge weight and a d-dimensional linear transformation. We generalize PageRank and effective resistance in the usual graph to their vectorized versions in connection graphs. These can be used as basic tools for organizing and analyzing complex data sets. For example, the generalized PageRank and effective resistance can be utilized to derive and modify diffusion distances for vector diffusion maps in data and image processing. Furthermore, the edge ranking of connection graphs determined by the vectorized PageRank and effective resistance is an essential part of sparsification algorithms which simplify connection graphs while preserving their global structure.
Finally, we further explore the use of the vectorized PageRank vector in measuring the (local) consistency or inconsistency of a small area around a vertex in the connection graph. Since a PageRank vector can also be expressed as a geometric sum of random walks with different numbers of steps, we use PageRank to define the (local) inconsistency coefficient, which measures the portion of "probability" that vanishes in the local random walk process. Furthermore, we develop an algorithm that uses the vectorized PageRank to find the locally consistent area around a vertex.
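The geometric-sum view of PageRank mentioned above can be illustrated for the usual (scalar) graph with a minimal sketch. It is not the improved algorithm of this thesis: it simply truncates the series alpha * sum_k (1 - alpha)^k * s * W^k. The function name, the dictionary-based graph format, the default parameters, and the choice of the plain random walk W = D^{-1} A (rather than, say, a lazy walk) are all assumptions made for illustration.

```python
def personalized_pagerank(adj, seed, alpha=0.15, steps=200):
    """Approximate personalized PageRank by a truncated geometric sum.

    adj:   dict vertex -> dict neighbor -> positive edge weight (undirected).
    seed:  dict vertex -> starting probability (values sum to 1).
    alpha: jumping constant in (0, 1).
    steps: number of terms kept in the series (hypothetical default).
    """
    # Weighted degree of every vertex, used to normalize walk transitions.
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}

    walk = dict(seed)              # distribution of the walk after k steps
    pr = {u: 0.0 for u in adj}     # running PageRank estimate
    coeff = alpha                  # alpha * (1 - alpha)^k

    for _ in range(steps):
        # Add the contribution of the k-step random walk.
        for u, mass in walk.items():
            pr[u] += coeff * mass
        # Advance the walk by one step: walk <- walk * D^{-1} A.
        nxt = {u: 0.0 for u in adj}
        for u, mass in walk.items():
            for v, w in adj[u].items():
                nxt[v] += mass * w / degree[u]
        walk = nxt
        coeff *= 1.0 - alpha
    return pr
```

For example, on a triangle with unit weights and all seed mass on one vertex, the seed vertex receives the largest value and the other two receive equal values. The truncation discards a total mass of (1 - alpha)^steps, which is negligible for the defaults used here.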