Incremental DFS algorithms: a theoretical and experimental study

The depth-first search (DFS) tree is a fundamental data structure for solving graph problems. The DFS tree of a graph $G$ with $n$ vertices and $m$ edges can be built in $O(m+n)$ time. To date, only a few algorithms have been designed for maintaining a DFS tree incrementally, that is, under edge insertions. For undirected graphs, the two algorithms ADFS1 and ADFS2 [ICALP14] achieve total $O(n^{3/2}\sqrt{m})$ and $O(n^2)$ time, respectively. For DAGs, the only non-trivial algorithm, FDFS [IPL97], requires total $O(mn)$ time. In this paper, we carry out an extensive experimental and theoretical evaluation of the existing incremental DFS algorithms on random and real graphs, and derive the following results.

1. For the insertion of a uniformly random sequence of $\binom{n}{2}$ edges, ADFS1, ADFS2 and FDFS perform equally well and are found experimentally to take $\Theta(n^2)$ time. This is quite surprising because the worst-case bounds of ADFS1 and FDFS exceed $\Theta(n^2)$ by factors of $\sqrt{m/n}$ and $m/n$, respectively. We complement this result with a probabilistic analysis of these algorithms, proving an $\tilde{O}(n^2)$ bound on the update time. In doing so, we derive results about the structure of a DFS tree in random graphs, which are of independent interest.

2. These insights led us to design an extremely simple incremental DFS algorithm for both undirected and directed graphs. This algorithm theoretically matches and experimentally outperforms the state of the art on dense random graphs. It can also be used as a single-pass semi-streaming algorithm for incremental DFS and strong connectivity in random graphs.

3. Even on real graphs, both ADFS1 and FDFS perform much better than their theoretical bounds suggest. Here again, we present two simple algorithms for incremental DFS, one for directed and one for undirected real graphs. In fact, our algorithm for directed graphs almost always matches the performance of FDFS.
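To make the setting concrete, the following is a minimal Python sketch of a naive baseline for incremental DFS in an undirected graph. It is not ADFS1, ADFS2, FDFS, or the simple algorithm proposed in the paper; the class name `NaiveIncrementalDFS` and its fields are hypothetical and chosen only for illustration. It relies on the standard fact that a rooted spanning forest of an undirected graph is a DFS forest exactly when every non-tree edge joins an ancestor to a descendant, so an inserted edge of that kind leaves the tree valid, and any other insertion triggers an $O(m+n)$ rebuild.

```python
from typing import List


class NaiveIncrementalDFS:
    """Baseline sketch: keep a DFS forest of an undirected graph under edge
    insertions by rebuilding from scratch whenever the new edge is not a
    back edge. Illustrative only; not an algorithm from the paper."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.adj: List[List[int]] = [[] for _ in range(n)]
        self.builds = 0  # number of full O(m + n) constructions so far
        self._rebuild()

    def _rebuild(self) -> None:
        """Iterative DFS over all vertices, recording parents and entry/exit times."""
        n = self.n
        self.parent = [-1] * n  # -1 marks the root of a tree in the forest
        self.tin = [-1] * n     # entry time, -1 means "not visited yet"
        self.tout = [-1] * n    # exit time
        clock = 0
        for root in range(n):
            if self.tin[root] != -1:
                continue
            self.tin[root] = clock
            clock += 1
            stack = [(root, 0)]  # (vertex, index of next neighbour to scan)
            while stack:
                v, i = stack[-1]
                if i < len(self.adj[v]):
                    stack[-1] = (v, i + 1)
                    w = self.adj[v][i]
                    if self.tin[w] == -1:
                        self.parent[w] = v
                        self.tin[w] = clock
                        clock += 1
                        stack.append((w, 0))
                else:
                    self.tout[v] = clock
                    clock += 1
                    stack.pop()
        self.builds += 1

    def _is_ancestor(self, a: int, b: int) -> bool:
        """True if a is an ancestor of b (or a == b) in the current forest."""
        return self.tin[a] <= self.tin[b] and self.tout[b] <= self.tout[a]

    def insert_edge(self, u: int, v: int) -> None:
        self.adj[u].append(v)
        self.adj[v].append(u)
        # A back edge keeps the current forest a valid DFS forest.
        if not (self._is_ancestor(u, v) or self._is_ancestor(v, u)):
            self._rebuild()


if __name__ == "__main__":
    dfs = NaiveIncrementalDFS(4)
    for edge in [(0, 1), (1, 2), (0, 2), (2, 3)]:
        dfs.insert_edge(*edge)
    print(dfs.parent, dfs.builds)
```

In the worst case this baseline rebuilds on every insertion, for a total of $O(m(m+n))$ time, which is the cost that the specialised algorithms discussed in the abstract are designed to avoid.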

[1]  Surender Baswana,et al.  Incremental Algorithm for Maintaining a DFS Tree for Undirected Graphs , 2017, Algorithmica.

[2]  Abhabongse Janthong,et al.  Streaming Algorithm for Determining a Topological Ordering of a Digraph , 2014 .

[3]  Ulrik Brandes,et al.  Computing Wikipedia Edit-Networks , 2009 .

[4]  Paolo Avesani,et al.  Controversial Users Demand Local Trust Metrics: An Experimental Study on Epinions.com Community , 2005, AAAI.

[5]  Robert E. Tarjan,et al.  Dynamic trees as search trees via euler tours, applied to the network simplex algorithm , 1997, Math. Program..

[6]  Vicenç Gómez,et al.  Statistical analysis of the social network and discussion threads in slashdot , 2008, WWW.

[7]  Jean-Marc Vincent,et al.  Random graph generation for scheduling simulations , 2010, SimuTools.

[8]  Tore Opsahl,et al.  Clustering in weighted networks , 2009, Soc. Networks.

[9]  Clifford Stein,et al.  Extending Search Phases in the Micali-Vazirani Algorithm , 2017, SEA.

[10]  R. Bruce Mattingly,et al.  Implementing an $O(\sqrt{N}M)$ Cardinality Matching Algorithm , 1991, Network Flows And Matching.

[11]  Ulrik Brandes,et al.  Network analysis of collaboration structure in Wikipedia , 2009, WWW '09.

[12]  Giuseppe F. Italiano,et al.  Experimental analysis of dynamic all pairs shortest path algorithms , 2004, SODA '04.

[13]  Peter Druschel,et al.  Online social networks: measurement, analysis, and applications to distributed information systems , 2009 .

[14]  Rajeev Motwani,et al.  Average-case analysis of algorithms for matchings and related problems , 1994, JACM.

[15]  Giuseppe Cattaneo,et al.  An empirical study of dynamic graph algorithms , 1996, SODA '96.

[16]  Munmun De Choudhury,et al.  Social Synchrony: Predicting Mimicry of User Actions in Online Social Media , 2009, 2009 International Conference on Computational Science and Engineering.

[17]  Mikkel Thorup,et al.  An Experimental Study of Polylogarithmic, Fully Dynamic, Connectivity Algorithms , 2001, JEAL.

[18]  John D. Kececioglu,et al.  Computing Maximum-Cardinality Matchings in Sparse General Graphs , 1998, Algorithm Engineering.

[19]  Peter Sanders,et al.  Dynamic Highway-Node Routing , 2007, WEA.

[20]  Celso C. Ribeiro,et al.  Experimental Analysis of Algorithms for Updating Minimum Spanning Trees on Graphs Subject to Changes on Edge Weights , 2007, WEA.

[21]  Surender Baswana,et al.  On Dynamic DFS Tree in Directed Graphs , 2015, MFCS.

[22]  Richard M. Karp,et al.  A n^5/2 Algorithm for Maximum Matchings in Bipartite Graphs , 1971, SWAT.

[23]  Thomas C. O'Connell,et al.  A Survey of Graph Algorithms Under Extended Streaming Models of Computation , 2013, Fundamental Problems in Computing.

[24]  Jérôme Kunegis,et al.  KONECT: the Koblenz network collection , 2013, WWW.

[25]  Umberto Ferraro Petrillo,et al.  Maintaining dynamic minimum spanning trees: An experimental study , 2002, Discret. Appl. Math..

[26]  Dorothea Wagner,et al.  Dynamic graph clustering combining modularity and smoothness , 2013, JEAL.

[27]  Giorgio Gambosi,et al.  The Incremental Maintenance of a Depth-First-Search Tree in Directed Acyclic Graphs , 1997, Inf. Process. Lett..

[28]  Jure Leskovec,et al.  SNAP Datasets: Stanford Large Network Dataset Collection , 2014 .

[29]  Jure Leskovec,et al.  Governance in Social Media: A Case Study of the Wikipedia Promotion Process , 2010, ICWSM.

[30]  Ulrich Meyer,et al.  Single-source shortest-paths on arbitrary directed graphs in linear average-case time , 2001, SODA '01.

[31]  Tad Hogg,et al.  Social dynamics of Digg , 2010, EPJ Data Science.

[32]  Robert E. Tarjan,et al.  Depth-First Search and Linear Graph Algorithms , 1972, SIAM J. Comput..

[33]  Krishna P. Gummadi,et al.  On the evolution of user interaction in Facebook , 2009, WOSN '09.

[34]  Robert Sedgewick,et al.  Average case analysis of graph-searching algorithms , 1990 .

[35]  Ian T. Foster,et al.  Mapping the Gnutella Network , 2002, IEEE Internet Comput..

[36]  Guy Melançon,et al.  Just how dense are dense graphs in the real world?: a methodological note , 2006, BELIV '06.

[37]  A. Frieze,et al.  Introduction to Random Graphs , 2016 .

[38]  Surender Baswana,et al.  Incremental Algorithm for Maintaining DFS Tree for Undirected Graphs , 2014, ICALP.

[39]  Christos Faloutsos,et al.  Graphs over time: densification laws, shrinking diameters and possible explanations , 2005, KDD '05.

[40]  Steven T. Crocker.  An Experimental Comparison of Two Maximum Cardinality Matching Programs , 1991, Network Flows And Matching.

[41]  Matthias Ruhl,et al.  Efficient algorithms for new computational models , 2003 .

[42]  Robert E. Tarjan,et al.  Efficient Planarity Testing , 1974, JACM.

[43]  Robert E. Tarjan,et al.  Finding Dominators in Directed Graphs , 1974, SIAM J. Comput..

[44]  P. Erdos,et al.  On the evolution of random graphs , 1984 .

[45]  Surender Baswana,et al.  Dynamic DFS in Undirected Graphs: breaking the O(m) barrier , 2015, SODA.

[46]  Daniel Massey,et al.  Collecting the internet AS-level topology , 2005, CCRV.

[47]  Christos Faloutsos,et al.  Graph evolution: Densification and shrinking diameters , 2006, TKDD.

[48]  Theresa Migler,et al.  Lower Bounds for Testing Digraph Connectivity with One-Pass Streaming Algorithms , 2014, IEEE Letters of the Computer Society.

[49]  Alberto Marchetti-Spaccamela,et al.  Average Case Analysis of Fully Dynamic Reachability for Directed Graphs , 1996, RAIRO Theor. Informatics Appl..

[50]  Robert E. Tarjan,et al.  Maintaining bridge-connected and biconnected components on-line , 1992, Algorithmica.

[51]  Michael Ley,et al.  The DBLP Computer Science Bibliography: Evolution, Research Issues, Perspectives , 2002, SPIRE.

[52]  Tsan-sheng Hsu,et al.  Finding Articulation Points of Large Graphs in Linear Time , 2015, WADS.

[53]  Krishna P. Gummadi,et al.  Growth of the flickr social network , 2008, WOSN '08.

[54]  Yiming Yang,et al.  The Enron Corpus: A New Dataset for Email Classification Research , 2004 .