Implicit Decomposition for Write-Efficient Connectivity Algorithms

The future of main memory appears to lie in new technologies that provide strong capacity-to-performance ratios but whose write operations are much more expensive than reads in terms of latency, bandwidth, and energy. Motivated by this trend, we propose sequential and parallel algorithms that solve graph connectivity problems using significantly fewer writes than conventional algorithms. Our primary algorithmic tool is the construction of an o(n)-sized implicit decomposition of a bounded-degree graph G on n nodes, which, combined with read-only access to G, enables fast answers to connectivity and biconnectivity queries on G. The construction breaks the linear-write "barrier", resulting in costs that are asymptotically lower than those of conventional algorithms while adding only a modest cost to query time. For general non-sparse graphs on m edges, we also provide the first parallel algorithms for connectivity and biconnectivity that use o(m) writes and O(m) operations. These algorithms provide insight into how applications can efficiently process computations on large graphs in systems with read-write asymmetry.
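
To make the query side of this idea concrete, the following is a minimal sketch and not the construction from the paper. It assumes a hypothetical adjacency-dict graph `adj` and a spacing parameter `k`, and it keeps component labels only for a subset of "center" vertices (those at BFS depth divisible by `k`). A query then searches outward from a vertex using only reads of `adj` until it meets a stored center. For brevity, `build_centers` uses an ordinary BFS with a full visited set, so it does not avoid linear writes during construction and makes no claim about how many centers are stored; the function names are illustrative only.

```python
from collections import deque


def build_centers(adj, k):
    """Label a subset of vertices ("centers") with a component id.

    Assumption for this sketch: centers are the vertices whose BFS depth
    from the component's start vertex is a multiple of k. This naive
    build keeps a full visited set, so it is NOT write-efficient.
    """
    label = {}   # center vertex -> id of its connected component
    seen = set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([(start, 0)])
        while queue:
            v, depth = queue.popleft()
            if depth % k == 0:
                label[v] = start  # the start vertex names the component
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append((w, depth + 1))
    return label


def component_of(adj, label, v):
    """Search from v, reading adj only, until a stored center is found."""
    seen = {v}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if u in label:
            return label[u]
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    raise ValueError("no center reachable; label was built for another graph")


def connected(adj, label, u, v):
    """Two vertices are connected iff their nearest centers share a label."""
    return component_of(adj, label, u) == component_of(adj, label, v)


# Example graph: a path 0-1-2-3-4, an edge 5-6, and an isolated vertex 7.
adj = {
    0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3],
    5: [6], 6: [5],
    7: [],
}
label = build_centers(adj, k=2)
```

In this sketch every vertex has a BFS-tree ancestor that is a center fewer than `k` hops away, so each query explores only a neighborhood of bounded radius, which stays small when the graph has bounded degree. The persistent state is the `label` table over the centers rather than a label per vertex.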
