Exploiting DMA to enable non-blocking execution in Decoupled Threaded Architecture

DTA (Decoupled Threaded Architecture) is designed to exploit fine- and medium-grained thread-level parallelism (TLP) by combining a distributed hardware scheduling unit with existing simple cores (in-order pipelines, no branch predictors, no reorder buffers).
