Toolchain for Programming, Simulating and Studying the XMT Many-Core Architecture
Fuat Keceli | Alexandros Tzannes | George C. Caragea | Rajeev Barua | Uzi Vishkin
[1] Sarita V. Adve, et al. Shared Memory Consistency Models: A Tutorial, 1996, Computer.
[2] Todd M. Austin, et al. SimpleScalar: An Infrastructure for Computer System Modeling, 2002, Computer.
[3] Henry Wong, et al. Analyzing CUDA workloads using a detailed GPU simulator, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[4] Uzi Vishkin, et al. A pilot study to compare programming effort for two parallel programming models, 2007, J. Syst. Softw.
[5] Uzi Vishkin, et al. Using simple abstraction to reinvent computing for parallelism, 2011, Commun. ACM.
[6] J. Banks, et al. Discrete-Event System Simulation, 1995.
[7] Ralph Grishman, et al. The NYU Ultracomputer—designing a MIMD, shared-memory parallel machine (Extended Abstract), 1982, ISCA 1982.
[8] George C. Caragea, et al. Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform, 2006, Handbook of Parallel Computing.
[9] Greg Hamerly, et al. SimPoint 3.0: Faster and More Flexible Program Analysis, 2005.
[10] S. Sitharama Iyengar, et al. Introduction to parallel algorithms, 1998, Wiley series on parallel and distributed computing.
[11] Joseph JáJá, et al. An Introduction to Parallel Algorithms, 1992.
[12] George C. Caragea, et al. Brief announcement: better speedups for parallel max-flow, 2011, SPAA '11.
[13] Hyunjin Lee, et al. TPTS: A Novel Framework for Very Fast Manycore Processor Architecture Simulation, 2008, 2008 37th International Conference on Parallel Processing.
[14] Fuat Keceli, et al. Resource-Aware Compiler Prefetching for Many-Cores, 2010, 2010 Ninth International Symposium on Parallel and Distributed Computing.
[15] Uzi Vishkin, et al. Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract), 1998, SPAA '98.
[16] Jiang Zhu, et al. Building a RCP (Rate Control Protocol) Test Network, 2007.
[17] Alexandros Tzannes, et al. Lazy binary-splitting: a run-time adaptive work-stealing scheduler, 2010, PPoPP '10.
[18] Uzi Vishkin, et al. XMT-GPU: A PRAM Architecture for Graphics Computation, 2008, 2008 37th International Conference on Parallel Processing.
[19] George C. Caragea, et al. General-Purpose vs. GPU: Comparison of Many-Cores on Irregular Workloads, 2010.
[20] Gang Qu, et al. A Mesh-of-Trees Interconnection Network for Single-Chip Parallel Processing, 2006, IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06).
[21] Uzi Vishkin, et al. Towards a First Vertical Prototyping of an Extremely Fine-Grained Parallel Programming Approach, 2003, Theory of Computing Systems.
[22] David A. Bader, et al. An experimental study of parallel biconnected components algorithms on symmetric multiprocessors (SMPs), 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[23] Uzi Vishkin, et al. Using Simple Abstraction to Guide the Reinvention of Computing for Parallelism, 2009.
[24] A. B. Saybasili. Highly Parallel Multi-Dimensional Fast Fourier Transform on Fine- and Coarse-Grained Many-Core Approaches, 2022.
[25] Sanguthevar Rajasekaran, et al. Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform, 2007.
[26] Uzi Vishkin, et al. PRAM-on-chip: first commitment to silicon, 2007, SPAA '07.
[27] George C. Necula, et al. CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs, 2002, CC.
[28] R. M. Fujimoto, et al. Parallel discrete event simulation, 1989, WSC '89.
[29] Uzi Vishkin, et al. A Low-Overhead Asynchronous Interconnection Network for GALS Chip Multiprocessors, 2011, 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip.
[30] Gang Qu, et al. Layout-Accurate Design and Implementation of a High-Throughput Interconnection Network for Single-Chip Parallel Processing, 2007, 15th Annual IEEE Symposium on High-Performance Interconnects (HOTI 2007).
[31] Hans-Juergen Boehm, et al. HP Laboratories, 2006.
[32] Sheng Liang, et al. Java Native Interface: Programmer's Guide and Reference, 1999.
[33] Uzi Vishkin, et al. Is teaching parallel algorithmic thinking to high school students possible?: one teacher's experience, 2010, SIGCSE.
[34] Uzi Vishkin, et al. FPGA-based prototype of a PRAM-on-chip processor, 2008, CF '08.
[35] Christoph W. Kessler, et al. Practical PRAM programming, 2000, Wiley series on parallel and distributed computing.
[36] Jeremy Manson, et al. The Java memory model, 2005, POPL '05.
[37] Brad Calder, et al. SimPoint 3.0: Faster and More Flexible Program Phase Analysis, 2005, J. Instr. Level Parallelism.
[38] Kevin Skadron, et al. Temperature-aware microarchitecture, 2003, ISCA '03.
[39] Fredrik Larsson, et al. Simics: A Full System Simulation Platform, 2002, Computer.
[40] Zhengyu He, et al. Dynamically tuned push-relabel algorithm for the maximum flow problem on CPU-GPU-Hybrid platforms, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[41] Laurie J. Hendren, et al. SableCC, an object-oriented compiler framework, 1998, Proceedings. Technology of Object-Oriented Languages. TOOLS 26 (Cat. No.98EX176).
[42] George C. Caragea, et al. Brief announcement: performance potential of an easy-to-program PRAM-on-chip prototype versus state-of-the-art processor, 2009, SPAA '09.
[43] Sanguthevar Rajasekaran, et al. Handbook of Parallel Computing - Models, Algorithms and Applications, 2007.
[44] Fuat Keceli, et al. Power-Performance Comparison of Single-Task Driven Many-Cores, 2011, 2011 IEEE 17th International Conference on Parallel and Distributed Systems.
[45] Alexandros Tzannes, et al. The compiler for the XMTC parallel language: Lessons for compiler developers and in-depth description, 2011.