Active messages as a spanning model for parallel graph computation
[1] P. Geoffray. Myrinet express (MX): Is your interconnect smart? , 2004, Proceedings. Seventh International Conference on High Performance Computing and Grid in Asia Pacific Region, 2004..
[2] K. Mani Chandy,et al. How processes learn , 1985, ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing.
[3] Robert D. Blumofe,et al. Scheduling multithreaded computations by work stealing , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[4] David A. Bader,et al. An Experimental Study of A Parallel Shortest Path Algorithm for Solving Large-Scale Graph Instances , 2007, ALENEX.
[5] Ralph Johnson,et al. Design Patterns: Elements of Reusable Object-Oriented Software , 2019 .
[6] Sandeep Koranne,et al. Boost C++ Libraries , 2011 .
[7] B. Ramkumar,et al. A dynamic and adaptive quiescence detection algorithm , 1993 .
[8] Leslie Lamport,et al. Distributed snapshots: determining global states of distributed systems , 1985, TOCS.
[9] Brad Richards,et al. Java-Based DSM with Object-Level Coherence Protocol Selection , 2003 .
[10] David A. Patterson,et al. Computer Architecture, Fifth Edition: A Quantitative Approach , 2011 .
[11] Eli Upfal,et al. Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems , 1997, IEEE Trans. Parallel Distributed Syst..
[12] Robert J. Harrison,et al. Performance and experience with LAPI-a new high-performance communication library for the IBM RS/6000 SP , 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.
[13] David A. Bader,et al. Massive streaming data analytics: A case study with clustering coefficients , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[14] Andrew Lumsdaine,et al. Effecting parallel graph eigensolvers through library composition , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[15] Paul Erdős,et al. On random graphs, I , 1959 .
[16] Toyotaro Suzumura,et al. Introducing ScaleGraph: an X10 library for billion scale graph analytics , 2012, X10 '12.
[17] Jack J. Dongarra,et al. Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[18] Robert Thurlow,et al. RPC: Remote Procedure Call Protocol Specification Version 2 , 2009, RFC.
[19] Yogish Sabharwal,et al. Software Routing and Aggregation of Messages to Optimize the Performance of HPCC Randomaccess Benchmark , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[20] Vern Paxson,et al. Bro: a system for detecting network intruders in real-time , 1998, Comput. Networks.
[21] Richard P. Martin,et al. Assessing Fast Network Interfaces , 1996, IEEE Micro.
[22] Christos Faloutsos,et al. Graphs over time: densification laws, shrinking diameters and possible explanations , 2005, KDD '05.
[23] Carlos Guestrin,et al. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud , 2012 .
[24] Jim Waldo. Remote procedure calls and Java Remote Method Invocation , 1998, IEEE Concurr..
[25] Arnold L. Rosenberg,et al. Graph Separators, with Applications , 2001, Frontiers of Computer Science.
[26] P. Erdős,et al. On sparse graphs with dense long paths , 1975 .
[27] Jose Sreeram,et al. UPC Queues for Scalable Graph Traversals: Design and Evaluation on InfiniBand Clusters , 2011 .
[28] John H. Reif,et al. Depth-First Search is Inherently Sequential , 1985, Inf. Process. Lett..
[29] Biswanath Mukherjee,et al. DIDS (distributed intrusion detection system)—motivation, architecture, and an early prototype , 1997 .
[30] Christos Faloutsos,et al. R-MAT: A Recursive Model for Graph Mining , 2004, SDM.
[31] Robert C. Daley,et al. The Multics virtual memory , 1972, Commun. ACM.
[32] Edmond Chow,et al. A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[33] Jesse Davis,et al. Method for module interaction in a Modular Architecture for Sensor Systems (MASS) , 2005 .
[34] Jonathan W. Berry,et al. Software and Algorithms for Graph Queries on Multithreaded Architectures , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[35] Jack J. Dongarra,et al. Automated empirical optimizations of software and the ATLAS project , 2001, Parallel Comput..
[36] Douglas P. Gregor,et al. The Parallel BGL: A Generic Library for Distributed Graph Computations , 2005 .
[37] Laxmikant V. Kalé,et al. Chare Kernel - a Runtime Support System for Parallel Computations , 1991, J. Parallel Distributed Comput..
[38] Sebastian Burckhardt,et al. The design of a task parallel library , 2009, OOPSLA.
[39] Salvatore J. Stolfo,et al. Distributed data mining in credit card fraud detection , 1999, IEEE Intell. Syst..
[40] Ulrich Meyer,et al. Improved External Memory BFS Implementation , 2007, ALENEX.
[41] Monika Henzinger,et al. Maintaining Minimum Spanning Forests in Dynamic Graphs , 2001, SIAM J. Comput..
[42] W. E. Nagel. 1988 International conference on supercomputing , 1988 .
[43] Robert W. Numrich,et al. Co-array Fortran for parallel programming , 1998, FORF.
[44] Amith R. Mamidala,et al. PAMI: A Parallel Active Message Interface for the Blue Gene/Q Supercomputer , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[45] Torsten Hoefler,et al. A space-efficient parallel algorithm for computing betweenness centrality in distributed memory , 2010, 2010 International Conference on High Performance Computing.
[46] Timothy G. Mattson,et al. Patterns for parallel programming , 2004 .
[47] John R. Gilbert,et al. Linear algebraic primitives for parallel computing on large graphs , 2010 .
[48] Andrew Lumsdaine,et al. Extensible PGAS semantics for C++ , 2010, PGAS '10.
[49] Seth Copen Goldstein,et al. Active Messages: A Mechanism for Integrated Communication and Computation , 1992, Proceedings of the 19th Annual International Symposium on Computer Architecture.
[50] John R. Gilbert,et al. High-Performance Graph Algorithms from Parallel Sparse Matrices , 2006, PARA.
[51] Allan Porterfield,et al. The Tera computer system , 1990 .
[52] Ulrich Meyer,et al. Δ-stepping: a parallelizable shortest path algorithm , 2003, J. Algorithms.
[53] Leslie Lamport,et al. Time, clocks, and the ordering of events in a distributed system , 1978, CACM.
[54] Nir Shavit,et al. Software transactional memory , 1995, PODC '95.
[55] David A. Bader,et al. Parallel Algorithms for Evaluating Centrality Indices in Real-world Networks , 2006, 2006 International Conference on Parallel Processing (ICPP'06).
[56] Friedemann Mattern,et al. Algorithms for distributed termination detection , 1987, Distributed Computing.
[57] Kurt Mehlhorn,et al. A Parallelization of Dijkstra's Shortest Path Algorithm , 1998, MFCS.
[58] Duncan J. Watts,et al. Collective dynamics of ‘small-world’ networks , 1998, Nature.
[59] Albert Chan,et al. CGMgraph/CGMlib: Implementing and Testing CGM Graph Algorithms on PC Clusters , 2003, PVM/MPI.
[60] Keshav Pingali,et al. Optimistic parallelism requires abstractions , 2009, CACM.
[61] D. Corneil,et al. An Efficient Algorithm for Graph Isomorphism , 1970, JACM.
[62] Andrew Lumsdaine,et al. Single-Source Shortest Paths with the Parallel Boost Graph Library , 2006, The Shortest Path Problem.
[63] Maurice Herlihy,et al. Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[64] David A. Bader,et al. SNAP, Small-world Network Analysis and Partitioning: An open-source parallel graph framework for the exploration of large-scale networks , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[65] Jack J. Dongarra,et al. A set of level 3 basic linear algebra subprograms , 1990, TOMS.
[66] Message Passing Interface Forum. MPI: A message-passing interface standard , 1994 .
[67] Burton J. Smith. Architecture And Applications Of The HEP Multiprocessor Computer System , 1982, Optics & Photonics.
[68] John R. Gilbert,et al. A Unified Framework for Numerical and Combinatorial Computing , 2008, Computing in Science & Engineering.
[69] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[70] Albert-László Barabási,et al. Emergence of scaling in random networks , 1999, Science.
[71] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[72] Bjarne Stroustrup,et al. The Design and Evolution of C++ , 1994 .
[73] John R. Gilbert,et al. Sparse Matrices in MATLAB: Design and Implementation , 1992, SIAM J. Matrix Anal. Appl..
[74] Torsten Hoefler,et al. Implementation and performance analysis of non-blocking collective operations for MPI , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[75] Kevin J. Lang. Fixing two weaknesses of the Spectral Method , 2005, NIPS.
[76] Dennis Shasha,et al. Algorithmics and applications of tree and graph searching , 2002, PODS.
[77] Torsten Hoefler,et al. AM++: A generalized active message framework , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[78] Aydın Buluç. Linear Algebraic Primitives for Parallel Computing on Large Graphs , 2010 .
[79] Donald Yeung,et al. Sparcle: an evolutionary processor design for large-scale multiprocessors , 1993, IEEE Micro.
[80] V. Jacobson,et al. Congestion avoidance and control , 1988, CCRV.
[81] Petr Konecny. Introducing the Cray XMT , 2007 .
[82] Christos Faloutsos,et al. Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication , 2005, PKDD.
[83] Daisuke Takahashi,et al. The HPC Challenge (HPCC) benchmark suite , 2006, SC.
[84] Yuefan Deng,et al. New trends in high performance computing , 2001, Parallel Computing.
[85] Katherine A. Yelick,et al. Titanium: A High-performance Java Dialect , 1998, Concurr. Pract. Exp..
[86] John E. Stone,et al. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems , 2010, Computing in Science & Engineering.
[87] Jonathan W. Berry,et al. Challenges in Parallel Graph Processing , 2007, Parallel Process. Lett..
[88] Matthew H. Austern. Generic programming and the STL - using and extending the C++ standard template library , 1999, Addison-Wesley professional computing series.
[89] K. Glasgow,et al. Los Angeles, California , 2003 .
[90] Andrew Lumsdaine,et al. Lifting sequential graph algorithms for distributed-memory parallel computation , 2005, OOPSLA '05.
[91] Carl Hewitt,et al. The incremental garbage collection of processes , 1977, Artificial Intelligence and Programming Languages.
[92] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley , 2006 .
[93] Jeremiah Willcock,et al. Expressing graph algorithms using generalized active messages , 2013, PPoPP 2013.
[94] Jack J. Dongarra,et al. Algorithm 656: an extended set of basic linear algebra subprograms: model implementation and test programs , 1988, TOMS.
[95] Brian W. Barrett,et al. Introducing the Graph 500 , 2010 .
[96] Lawrence Rauchwerger,et al. Identifying Strongly Connected Components in Parallel , 2000, PPSC.
[97] Veljko M. Milutinovic,et al. Distributed shared memory: concepts and systems , 1997, IEEE Parallel Distributed Technol. Syst. Appl..
[98] Maurice Herlihy,et al. A methodology for implementing highly concurrent data objects , 1993, TOPL.
[99] Francisco Jose Arzu. Standard Templates Adaptive Parallel Library , 2000 .
[100] Rajeev Motwani,et al. The PageRank Citation Ranking: Bringing Order to the Web , 1999, WWW 1999.
[101] Philip Heidelberger,et al. The deep computing messaging framework: generalized scalable message passing on the blue gene/P supercomputer , 2008, ICS '08.
[102] Keshav Pingali,et al. The tao of parallelism in algorithms , 2011, PLDI '11.
[103] Arti Mohanpurkar,et al. Credit card fraud detection using Hidden Markov Model , 2011, 2011 World Congress on Information and Communication Technologies.
[104] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming , 1998 .
[105] Andrew A. Chien,et al. Architecture of a message-driven processor , 1987, ISCA '87.
[106] Yuan Yu,et al. Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.
[107] Nancy M. Amato,et al. The STAPL parallel container framework , 2011, PPoPP '11.
[108] Dan Bonachea. GASNet Specification, v1.1 , 2002 .
[109] David Gelernter,et al. Generative communication in Linda , 1985, TOPL.
[110] Steven Fortune,et al. Parallelism in random access machines , 1978, STOC.
[111] Hans P. Zima,et al. The cascade high productivity language , 2004 .
[112] Donald Yeung,et al. The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor , 1991 .
[113] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[114] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[115] Daniele Frigioni,et al. Fully Dynamic Algorithms for Maintaining Shortest Paths Trees , 2000, J. Algorithms.
[116] Douglas Thain,et al. Qthreads: An API for programming with millions of lightweight threads , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[117] Jeong-Hoon Lee,et al. An In-depth Comparison of Subgraph Isomorphism Algorithms in Graph Databases , 2012, Proc. VLDB Endow..
[118] Samuel T. Chanson,et al. Process groups and group communications: classifications and requirements , 1990, Computer.
[119] Torsten Hoefler,et al. Active pebbles: parallel programming for data-driven applications , 2011, ICS '11.
[120] Bradley C. Kuszmaul,et al. Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.
[121] Torsten Hoefler,et al. Kanor - A Declarative Language for Explicit Communication , 2011, PADL.
[122] José E. Moreira,et al. Dissecting Cyclops: a detailed analysis of a multithreaded architecture , 2003, CARN.
[123] Edsger W. Dijkstra,et al. Termination Detection for Diffusing Computations , 1980, Inf. Process. Lett..
[124] Jehoshua Bruck,et al. Efficient algorithms for all-to-all communications in multi-port message-passing systems , 1994, SPAA '94.
[125] Lawrence Rauchwerger,et al. ARMI: a High Level Communication Library for STAPL , 2006, Parallel Process. Lett..
[126] Jaakko Järvi,et al. Concept-Controlled Polymorphism , 2003, GPCE.
[127] Ralph Duncan,et al. A Survey of Parallel Computer Architectures , 1990 .
[128] Uzi Vishkin,et al. An O(log n) Parallel Connectivity Algorithm , 1982, J. Algorithms.
[129] Maurice Herlihy,et al. The Aleph Toolkit: Support for Scalable Distributed Shared Objects , 1999, CANPC.
[130] Sartaj Sahni,et al. Handbook of Data Structures and Applications , 2004 .
[131] Anthony Skjellum,et al. An initial implementation of MPI , 1993 .
[132] Anoop Gupta,et al. Interleaving: a multithreading technique targeting multiprocessors and workstations , 1994, ASPLOS VI.
[133] Pradeep Dubey,et al. Larrabee: A Many-Core x86 Architecture for Visual Computing , 2009, IEEE Micro.
[134] Julian R. Ullmann,et al. An Algorithm for Subgraph Isomorphism , 1976, J. ACM.
[135] Daniel P. Friedman,et al. Aspects of Applicative Programming for Parallel Processing , 1978, IEEE Transactions on Computers.
[136] Vivek Sarkar,et al. X10: an object-oriented approach to non-uniform cluster computing , 2005, OOPSLA '05.
[137] Tamara G. Kolda,et al. Community structure and scale-free collections of Erdős-Rényi graphs , 2011, Physical review. E, Statistical, nonlinear, and soft matter physics.
[138] Sriram Krishnamoorthy,et al. Global Futures: A Multithreaded Execution Model for Global Arrays-based Applications , 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).
[139] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[140] Satoru Kawai,et al. An Algorithm for Drawing General Undirected Graphs , 1989, Inf. Process. Lett..
[141] L. Ridgway Scott,et al. Scientific Parallel Computing , 2005 .
[142] Peter Sanders,et al. Δ-stepping: a parallelizable shortest path algorithm , 2003, J. Algorithms.
[143] John R. Gilbert,et al. Sparse Matrices in Matlab*P: Design and Implementation , 2004, HiPC.
[144] Andrew Lumsdaine,et al. PFunc: modern task parallelism for modern high performance computing , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[145] Feipei Lai,et al. Adsmith: an efficient object-based distributed shared memory system on PVM , 1996, Proceedings Second International Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN'96).
[146] A. Gupta,et al. Exploring the benefits of multiple hardware contexts in a multiprocessor architecture: preliminary results , 1989, ISCA '89.
[147] Courtenay T. Vaughan,et al. A Simple Synchronous Distributed-Memory Algorithm for the HPCC RandomAccess Benchmark , 2006, 2006 IEEE International Conference on Cluster Computing.
[148] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[149] David A. Bader,et al. Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2 , 2006, 2006 International Conference on Parallel Processing (ICPP'06).
[150] M. Snir,et al. Ghost Cell Pattern , 2010, ParaPLoP '10.
[151] Gul A. Agha,et al. ACTORS - a model of concurrent computation in distributed systems , 1985, MIT Press series in artificial intelligence.
[152] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[153] Jesse Davis,et al. MASS: modular architecture for sensor systems , 2005, IPSN 2005. Fourth International Symposium on Information Processing in Sensor Networks, 2005..
[154] John R. Gilbert,et al. The Combinatorial BLAS: design, implementation, and applications , 2011, Int. J. High Perform. Comput. Appl..
[155] Reaz Hoque. CORBA 3 , 1998 .
[156] Christos Faloutsos,et al. PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations , 2009, 2009 Ninth IEEE International Conference on Data Mining.
[157] Scott Shenker,et al. Spark: Cluster Computing with Working Sets , 2010, HotCloud.
[158] Nissim Francez,et al. Distributed Termination , 1980, TOPL.
[159] Nancy M. Amato,et al. STAPL: A Standard Template Adaptive Parallel C++ Library , 2001 .
[160] Charles L. Lawson,et al. Basic Linear Algebra Subprograms for Fortran Usage , 1979, TOMS.
[161] Torsten Hoefler,et al. Scalable communication protocols for dynamic sparse data exchange , 2010, PPoPP '10.
[162] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[163] Chris J. Scheiman,et al. LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation , 1995, SPAA '95.
[164] David A. Bader,et al. Practical parallel algorithms for personalized communication and integer sorting , 1996, JEAL.
[165] Jinyang Li,et al. Piccolo: Building Fast, Distributed Programs with Partitioned Tables , 2010, OSDI.
[166] Edward M. Reingold,et al. Graph drawing by force‐directed placement , 1991, Softw. Pract. Exp..