Efficient compilation for queue size constrained queue processors
[1] David W. Wall,et al. Limits of instruction-level parallelism , 1991, ASPLOS IV.
[2] Dezsö Sima,et al. The Design Space of Register Renaming Techniques , 2000, IEEE Micro.
[3] Lenwood S. Heath,et al. Stack and Queue Layouts of Directed Acyclic Graphs: Part I , 1999, SIAM J. Comput.
[4] Liam Goudge,et al. Thumb: reducing the cost of 32-bit RISC performance in portable and consumer applications , 1996, COMPCON '96. Technologies for the Information Superhighway Digest of Papers.
[5] Thomas D. Burd,et al. Processor design for portable systems , 1996, J. VLSI Signal Process.
[6] Tsutomu Yoshinaga,et al. Parallel Queue Processor Architecture Based on Produced Order Computation Model , 2005, The Journal of Supercomputing.
[7] Aviral Shrivastava,et al. Compilation framework for code size reduction using reduced bit-width ISAs (rISAs) , 2006, TODE.
[8] Masahiro Sowa,et al. Design of a superscalar processor based on queue machine computation model , 1999, 1999 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM 1999). Conference Proceedings (Cat. No.99CH36368).
[9] Gary S. Tyson,et al. Register queues: a new hardware/software approach to efficient software pipelining , 2000, Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00622).
[10] Herman Schmit,et al. Queue machines: hardware compilation in hardware , 2002, Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[11] Manish Gupta,et al. Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors , 2000, IEEE Micro.
[12] Charles H. Moore,et al. The evolution of Forth , 1996 .
[13] Jozo J. Dujmovic,et al. Evolution and evaluation of SPEC benchmarks , 1998, PERV.
[14] Philip H. Sweany,et al. A Code Generation Framework for VLIW Architectures with Partitioned Register Banks , 2007 .
[15] Huiyang Zhou,et al. Code size efficiency in global scheduling for ILP processors , 2002, Proceedings Sixth Annual Workshop on Interaction between Compilers and Computer Architectures.
[16] Arquimedes Canedo,et al. Queue Register File Optimization Algorithm for QueueCore Processor , 2007 .
[17] Kevin D. Kissell. MIPS16: High-density MIPS for the Embedded Market , 1997 .
[18] Javier Zalamea,et al. Software and Hardware Techniques to Optimize Register File Utilization in VLIW Architectures , 2004, International Journal of Parallel Programming.
[19] Lenwood S. Heath,et al. Laying out Graphs Using Queues , 1992, SIAM J. Comput.
[20] Ikuya Kawasaki,et al. SH3: high code density, low power , 1995, IEEE Micro.
[21] Scott A. Mahlke,et al. Partitioning variables across register windows to reduce spill code in a low-power processor , 2005, IEEE Transactions on Computers.
[22] Josep Llosa,et al. Quantitative Evaluation of Register Pressure on Software Pipelined Loops , 1998, International Journal of Parallel Programming.
[23] Alexander V. Veidenbaum,et al. Power-Aware Compilation for Register File Energy Reduction , 2004, International Journal of Parallel Programming.
[24] Manuel E. Benitez,et al. Code generation for streaming: an access/execute mechanism , 1991, ASPLOS IV.
[25] Philip J. Koopman Jr.,et al. Stack computers: the new wave , 1989 .
[26] Bruno R. Preiss,et al. Data flow on a queue machine , 1985, ISCA 1985.
[27] Huibin Shi,et al. Investigating available instruction level parallelism for stack based machine architectures , 2004 .
[28] Gürhan Küçük,et al. Energy Efficient Register Renaming , 2003, PATMOS.
[29] Mike O'Connor,et al. PicoJava: A Direct Execution Engine For Java Bytecode , 1998, Computer.
[30] Hyuk-Jae Lee,et al. PARE: instruction set architecture for efficient code size reduction , 1999 .
[31] Henk Corporaal,et al. Partitioned register file for TTAs , 1995, MICRO 1995.
[32] Makoto Hasegawa,et al. High-speed top-of-stack scheme for VLSI processor: a management algorithm and its analysis , 1985, ISCA '85.
[33] Norman P. Jouppi,et al. Register file design considerations in dynamically scheduled processors , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[34] Arquimedes Canedo,et al. A new code generation algorithm for 2-offset producer order queue computation model , 2008, Comput. Lang. Syst. Struct.
[35] Jack B. Dennis,et al. A preliminary architecture for a basic data-flow processor , 1974, ISCA '98.
[36] Andrew Kennedy,et al. Design and implementation of generics for the .NET Common language runtime , 2001, PLDI '01.
[37] Russell P. Blake. Exploring a Stack Architecture , 1977, Computer.
[38] Tsutomu Yoshinaga,et al. High-Level Modeling and FPGA Prototyping of Produced Order Parallel Queue Processor Core , 2006, The Journal of Supercomputing.
[39] Frank Yellin,et al. The Java Virtual Machine Specification , 1996 .
[40] Kenneth C. Louden. P-code and compiler portability: experience with a Modula-2 optimizing compiler , 1990, SIGP.
[41] Wm. A. Wulf. Evaluation of the WM architecture , 1992, ISCA '92.
[42] Masahiro Sowa,et al. Design and architecture for an embedded 32-bit QueueCore , 2006, J. Embed. Comput.