Large-scale energy-efficient graph traversal: A path to efficient data-intensive supercomputing
Pradeep Dubey | Nadathur Satish | Changkyu Kim | Jatin Chhugani
[1] Ana Paula Appel,et al. Radius Plots for Mining Tera-byte Scale Graphs: Algorithms, Patterns, and Observations , 2010, SDM.
[2] Shuang Chen,et al. The entropy of ordered sequences and order statistics , 1990, IEEE Trans. Inf. Theory.
[3] Anthony Skjellum,et al. A Multithreaded Message Passing Interface (MPI) Architecture: Performance and Program Issues , 2001, J. Parallel Distributed Comput.
[4] David A. Bader,et al. On the architectural requirements for efficient execution of graph algorithms , 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[5] Pradeep Dubey,et al. Fast and Efficient Graph Traversal Algorithm for CPUs: Maximizing Single-Node Efficiency , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[6] Satoshi Matsuoka,et al. Performance characteristics of Graph500 on large-scale distributed environment , 2011, 2011 IEEE International Symposium on Workload Characterization (IISWC).
[7] D. Patterson,et al. Searching for a Parent Instead of Fighting Over Children: A Fast Breadth-First Search Implementation for Graph 500 , 2011 .
[8] Hugh E. Williams,et al. Compressing Integers for Fast File Access , 1999, Comput. J.
[9] Eduard Ayguadé,et al. Overlapping communication and computation by using a hybrid MPI/SMPSs approach , 2010, ICS '10.
[10] Kamesh Madduri,et al. Parallel breadth-first search on distributed memory systems , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[11] Pradeep Dubey,et al. FAST: fast architecture sensitive tree search on modern CPUs and GPUs , 2010, SIGMOD Conference.
[12] Jose Sreeram,et al. UPC Queues for Scalable Graph Traversals: Design and Evaluation on InfiniBand Clusters , 2011 .
[13] Charles E. Leiserson,et al. Fat-trees: Universal networks for hardware-efficient supercomputing , 1985, IEEE Transactions on Computers.
[14] Russ Bubley,et al. Randomized algorithms , 1995, CSUR.
[15] David A. Bader,et al. Advanced Shortest Paths Algorithms on a Massively-Multithreaded Architecture , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[16] Brian W. Barrett,et al. Introducing the Graph 500 , 2010 .
[17] Vicki H. Allan,et al. Software pipelining , 1995, CSUR.
[18] Bo Song,et al. Overlapping Communication and Computation in MPI by Multithreading , 2006, PDPTA.
[19] Mark Anderson. Better benchmarking for supercomputers , 2011 .
[20] Fabrizio Petrini,et al. Efficient Breadth-First Search on the Cell/BE Processor , 2008, IEEE Transactions on Parallel and Distributed Systems.
[21] Kunle Olukotun,et al. Accelerating CUDA graph algorithms at maximum warp , 2011, PPoPP '11.
[22] John Shalf,et al. The International Exascale Software Project roadmap , 2011, Int. J. High Perform. Comput. Appl.
[23] Amar Phanishayee,et al. FAWN: a fast array of wimpy nodes , 2009, SOSP '09.
[24] David A. Bader,et al. Scalable Graph Exploration on Multicore Processors , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[25] Charles E. Leiserson,et al. A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers) , 2010, SPAA '10.
[26] Alexander Zeier,et al. SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units , 2009, Proc. VLDB Endow.
[27] Hosung Park,et al. What is Twitter, a social network or a news media? , 2010, WWW '10.
[28] Krishna P. Gummadi,et al. Measurement and analysis of online social networks , 2007, IMC '07.
[29] J. Koomey. Worldwide electricity used in data centers , 2008 .
[30] Dan Suciu,et al. A query language for a Web-site management system , 1997, SIGMOD Rec.
[31] David A. Bader,et al. Approximating Betweenness Centrality , 2007, WAW.
[32] Yinglong Xia. Topologically Adaptive Parallel Breadth-First Search on Multicore Processors , 2010 .
[33] Edmond Chow,et al. A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L , 2005, ACM/IEEE SC 2005 Conference (SC'05).