C-Level Programming of Parallel Coprocessor Accelerators
[1] T. Knight,et al. Pathfinder: A Negotiation-Based Performance-Driven Router for FPGAs, 2012.
[2] Alan Edelman,et al. Language and compiler support for auto-tuning variable-accuracy algorithms , 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[3] Chi-Bang Kuan,et al. Automated Empirical Optimization , 2011, Encyclopedia of Parallel Computing.
[4] Srihari Cadambi,et al. A dynamically configurable coprocessor for convolutional neural networks , 2010, ISCA.
[5] William J. Dally,et al. Buffer-space efficient and deadlock-free scheduling of stream applications on multi-core architectures , 2010, SPAA '10.
[6] Dah-Jye Lee,et al. A Comparison Study on Implementing Optical Flow and Digital Communications on FPGAs and GPUs , 2010, TRETS.
[7] Anders Logg,et al. DOLFIN: Automated finite element computing , 2010, TOMS.
[8] Sean Rul,et al. An experimental study on performance portability of OpenCL kernels , 2010, HiPC 2010.
[9] Apan Qasem,et al. Evaluating the Role of Optimization-Specific Search Heuristics in Effective Autotuning, 2010.
[10] Cristian Grozea,et al. FPGA vs. Multi-core CPUs vs. GPUs: Hands-On Experience with a Sorting Application , 2010, Facing the Multicore-Challenge.
[11] Tao Wang,et al. An Implementation of Viterbi Algorithm on GPU , 2009, 2009 First International Conference on Information Science and Engineering.
[12] Carl Ebeling,et al. Static versus scheduled interconnect in Coarse-Grained Reconfigurable Arrays , 2009, 2009 International Conference on Field Programmable Logic and Applications.
[13] Wayne Luk,et al. Exploring Reconfigurable Architectures for Tree-Based Option Pricing Models , 2009, TRETS.
[14] Walter F. Tichy,et al. Atune-IL: An Instrumentation Language for Auto-tuning Parallel Applications , 2009, Euro-Par.
[15] Vahid Tabatabaee,et al. Tuning parallel applications in parallel , 2009, Parallel Comput..
[16] Jason Cong,et al. FCUDA: Enabling efficient compilation of CUDA kernels onto FPGAs , 2009, 2009 IEEE 7th Symposium on Application Specific Processors.
[17] Alan Edelman,et al. PetaBricks: a language and compiler for algorithmic choice , 2009, PLDI '09.
[18] Chun Chen,et al. Model-guided autotuning of high-productivity languages for petascale computing , 2009, HPDC '09.
[19] Jason Cong,et al. High-performance CUDA kernel execution on FPGAs , 2009, ICS.
[20] Chun Chen,et al. A scalable auto-tuning framework for compiler optimization , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[21] P. Sadayappan,et al. Annotation-based empirical performance tuning using Orio , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[22] Christoph A. Schaefer,et al. Reducing search space of auto-tuners using parallel patterns , 2009, 2009 ICSE Workshop on Multicore Software Engineering.
[23] Catalin Bogdan Ciobanu,et al. Wave field synthesis for 3D audio: architectural prospectives , 2009, CF '09.
[24] Victor Pankratius,et al. Auto-tuning support for manycore applications: perspectives for operating systems and compilers , 2009, OPSR.
[25] Wayne Luk,et al. A comparison of CPUs, GPUs, FPGAs, and massively parallel processor arrays for random number generation , 2009, FPGA '09.
[26] Scott Hauck,et al. FPGA-based front-end electronics for positron emission tomography , 2009, FPGA '09.
[27] Carl Ebeling,et al. SPR: an architecture-adaptive CGRA mapping tool , 2009, FPGA '09.
[28] Gérard Boudol,et al. Relaxed memory models: an operational approach , 2009, POPL '09.
[29] Edward T. Grochowski,et al. Larrabee: A many-Core x86 architecture for visual computing , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[30] A. DeHon,et al. Pipelining saturated accumulation , 2005, IEEE Transactions on Computers.
[31] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[32] William J. Dally,et al. A tuning framework for software-managed memory hierarchies , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[33] Meikang Qiu,et al. Timing optimization via nest-loop pipelining considering code size , 2008, Microprocess. Microsystems.
[34] Peter Y. K. Cheung,et al. Outer Loop Pipelining for Application Specific Datapaths in FPGAs , 2008, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[35] Changhee Lee,et al. Trash removal algorithm for fast construction of the elliptic Gabriel graph using Delaunay triangulation , 2008, Comput. Aided Des..
[36] Kevin Skadron,et al. Accelerating Compute-Intensive Applications with GPUs and FPGAs , 2008, 2008 Symposium on Application Specific Processors.
[37] Hans-Juergen Boehm,et al. Foundations of the C++ concurrency memory model , 2008, PLDI '08.
[38] Fang Zhong,et al. Parallel architecture for PCA image feature detection using FPGA , 2008, 2008 Canadian Conference on Electrical and Computer Engineering.
[39] Kristina Lerman,et al. Model-guided performance tuning of parameter values: A case study with molecular dynamics visualization , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[40] Jeff Mason,et al. CHiMPS: A C-level compilation flow for hybrid CPU-FPGA architectures , 2008, 2008 International Conference on Field Programmable Logic and Applications.
[41] Joseph M. Lancaster,et al. A Banded Smith-Waterman FPGA Accelerator for Mercury BLASTP , 2007, 2007 International Conference on Field Programmable Logic and Applications.
[42] R. C. Whaley,et al. Automated transformation for performance-critical kernels , 2007, LCSD '07.
[43] Albert Cohen,et al. Code-size conscious pipelining of imperfectly nested loops , 2007, MEDEA '07.
[44] Bradford L. Chamberlain,et al. Parallel Programmability and the Chapel Language , 2007, Int. J. High Perform. Comput. Appl..
[45] Michael F. P. O'Boyle,et al. Fast compiler optimisation evaluation using code-feature based performance prediction , 2007, CF '07.
[46] Maya Gokhale,et al. Matched Filter Computation on FPGA, Cell and GPU , 2007, 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2007).
[47] M. Butts,et al. A Structural Object Programming Model, Architecture, Chip and Tools for Reconfigurable Computing , 2007, 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2007).
[48] Georgi Gaydadjiev,et al. Architectural Exploration of the ADRES Coarse-Grained Reconfigurable Array , 2007, ARC.
[49] Richard W. Vuduc,et al. POET: Parameterized Optimizations for Empirical Tuning , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[50] Uday Bondhugula,et al. Automatic mapping of nested loops to FPGAs , 2007, PPoPP.
[51] V.K. Prasanna,et al. Preliminary Investigation of Advanced Electrostatics in Molecular Dynamics on Reconfigurable Computers , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[52] Scott A. Mahlke,et al. Streamroller: automatic synthesis of prescribed throughput accelerator pipelines , 2006, Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS '06).
[53] T. Mohsenin,et al. An asynchronous array of simple processors for dsp applications , 2006, 2006 IEEE International Solid State Circuits Conference - Digest of Technical Papers.
[54] Rudolf Eigenmann,et al. Fast, automatic, procedure-level performance tuning , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[55] David A. Padua,et al. In search of a program generator to implement generic transformations for high-performance computing , 2006, Sci. Comput. Program..
[56] Carl Ebeling,et al. Reducing the Space Complexity of Pipelined Routing Using Modified Range Encoding , 2006, 2006 International Conference on Field Programmable Logic and Applications.
[57] Byoung Kyu Choi,et al. Elliptic Gabriel graph for finding neighbors in a point set and its application to normal vector estimation , 2006, Comput. Aided Des..
[58] Ken Kennedy,et al. Automatic tuning of whole applications using direct search and a performance-based transformation system , 2006, The Journal of Supercomputing.
[59] Carl Ebeling,et al. A Type Architecture for Hybrid Micro-Parallel Computers , 2006, FCCM.
[60] Mark Stephenson,et al. Automating the construction of compiler heuristics using machine learning, 2006.
[61] Albert Cohen,et al. A Practical Method for Quickly Evaluating Program Optimizations , 2005, HiPEAC.
[62] Maya Gokhale,et al. Trident: an FPGA compiler framework for floating-point algorithms , 2005, International Conference on Field Programmable Logic and Applications, 2005..
[63] William Thies,et al. Optimizing stream programs using linear state space analysis , 2005, CASES '05.
[64] Alexandru Nicolau,et al. Enhanced Loop Coalescing: A Compiler Technique for Transforming Non-uniform Iteration Spaces , 2005, ISHPC.
[65] Scott Hauck,et al. SPIHT image compression on FPGAs , 2005, IEEE Transactions on Circuits and Systems for Video Technology.
[66] Katherine Yelick,et al. OSKI: A library of automatically tuned sparse matrix kernels, 2005.
[67] Franz Franchetti,et al. SPIRAL: Code Generation for DSP Transforms , 2005, Proceedings of the IEEE.
[68] Keshav Pingali,et al. Think globally, search locally , 2005, ICS '05.
[69] Grigori Fursin,et al. Probabilistic source-level optimisation of embedded programs , 2005, LCTES '05.
[70] Keith D. Cooper,et al. ACME: adaptive compilation made efficient , 2005, LCTES '05.
[71] William Thies,et al. Teleport messaging for distributed stream programs , 2005, PPoPP.
[72] João M. P. Cardoso. Dynamic loop pipelining in data-driven architectures , 2005, CF '05.
[73] Karl S. Hemmert,et al. An analysis of the double-precision floating-point FFT on FPGAs , 2005, 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'05).
[74] Daniel S. Poznanovic,et al. Application development on the SRC Computers, Inc. systems , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[75] Keith D. Underwood,et al. RC-BLAST: towards a portable, cost-effective open source hardware implementation , 2005, IEEE International Parallel and Distributed Processing Symposium.
[76] Chun Chen,et al. Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy , 2005, International Symposium on Code Generation and Optimization.
[77] Yuan Zhao,et al. Scalarization on Short Vector Machines , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005..
[78] Steven G. Johnson,et al. The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.
[79] Jeremy Manson,et al. The Java memory model , 2005, POPL '05.
[80] David Pellerin,et al. Practical FPGA programming in C, 2005.
[81] Anthony Widjaja,et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.
[82] Yuan Zhao,et al. Scalarization Using Loop Alignment and Loop Skewing , 2005, The Journal of Supercomputing.
[83] A.P. Kakarountas,et al. Speedups from partitioning software kernels to FPGA hardware in embedded SoCs , 2005, IEEE Workshop on Signal Processing Systems Design and Implementation, 2005..
[84] Jack Dongarra,et al. An Effective Empirical Search Method for Automatic Software Tuning, 2005.
[85] Carl Ebeling,et al. QuickRoute: a fast routing algorithm for pipelined architectures , 2004, Proceedings. 2004 IEEE International Conference on Field-Programmable Technology (IEEE Cat. No.04EX921).
[86] I-Hsin Chung,et al. Using Information from Prior Runs to Improve Automated Tuning Systems , 2004, Proceedings of the ACM/IEEE SC2004 Conference.
[87] Lai-Man Po,et al. Enhanced hexagonal search for fast block motion estimation , 2004, IEEE Transactions on Circuits and Systems for Video Technology.
[88] Edwin Hsing-Mean Sha,et al. General loop fusion technique for nested loops considering timing and code size , 2004, CASES '04.
[89] William J. Dally,et al. Evaluating the Imagine stream architecture , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[90] Douglas L. Jones,et al. Fast searches for effective optimization phase sequences , 2004, PLDI '04.
[91] Margo I. Seltzer,et al. Using probabilistic reasoning to automate software tuning , 2004, SIGMETRICS '04/Performance '04.
[92] Guang R. Gao,et al. Single-dimension software pipelining for multi-dimensional loops , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004..
[93] Dominique Lavenier,et al. Experience with a Hybrid Processor: K-Means Clustering , 2004, The Journal of Supercomputing.
[94] Keith D. Cooper,et al. Adaptive Optimizing Compilers for the 21st Century , 2002, The Journal of Supercomputing.
[95] Zoran Jovanovic,et al. Control Flow Regeneration for Software Pipelined Loops with Conditions , 2004, International Journal of Parallel Programming.
[96] Software pipelining: an effective scheduling technique for VLIW machines, 1988, SIGP.
[97] Allen,et al. Optimizing Compilers for Modern Architectures, 2004.
[98] Seth Copen Goldstein,et al. C to Asynchronous Dataflow Circuits: An End-to-End Toolflow, 2004.
[99] Jung Ho Ahn,et al. Merrimac: Supercomputing with Streams , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[100] Rudy Lauwereins,et al. ADRES: An Architecture with Tightly Coupled VLIW Processor and Coarse-Grained Reconfigurable Matrix , 2003, FPL.
[101] William J. Dally,et al. Programmable Stream Processors , 2003, Computer.
[102] Yunheung Paek,et al. Finding effective optimization phase sequences , 2003, LCTES '03.
[103] William R. Mark,et al. Cg: a system for programming graphics hardware in a C-like language , 2003, ACM Trans. Graph..
[104] Saman P. Amarasinghe,et al. Meta optimization: improving compiler heuristics with machine learning , 2003, PLDI '03.
[105] Gang Ren,et al. A comparison of empirical and model-driven optimization , 2003, PLDI '03.
[106] M. Forina,et al. Cluster analysis: significance, empty space, clustering tendency, non-uniformity. II--Empty Space index, 2003, Annali di chimica.
[107] Herman Schmit,et al. Efficient application representation for HASTE: Hybrid Architectures with a Single, Transformable Executable , 2003, 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 2003. FCCM 2003..
[108] David I. August,et al. Compiler optimization-space exploration , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..
[109] Brad Calder,et al. Phi-predication for light-weight if-conversion , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..
[110] Scott A. Mahlke,et al. Predicate-aware scheduling: a technique for reducing resource constraints , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..
[111] John Wawrzynek,et al. Post-placement C-slow retiming for the xilinx virtex FPGA , 2003, FPGA '03.
[112] Mihai Budiu,et al. Spatial Computation — Summary of the Ph , 2003 .
[113] Tamara G. Kolda,et al. Optimization by Direct Search: New Perspectives on Some Classical and Modern Methods , 2003, SIAM Rev..
[114] Henry Hoffmann,et al. A stream compiler for communication-exposed architectures , 2002, ASPLOS X.
[115] Krishna V. Palem,et al. Software bubbles: using predication to compensate for aliasing in software pipelines , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[116] Seth Copen Goldstein,et al. Compiling Application-Specific Hardware , 2002, FPL.
[117] Brad L. Hutchings,et al. Sea Cucumber: A Synthesizing Compiler for FPGAs , 2002, FPL.
[118] Fan Xiao,et al. Uniformity testing using minimal spanning tree , 2002, Object recognition supported by user interaction for service robots.
[119] Michael F. P. O'Boyle,et al. Evaluating Iterative Compilation , 2002, LCPC.
[120] Philip H. Sweany,et al. Loop fusion for clustered VLIW architectures , 2002, LCTES/SCOPES '02.
[121] Randolph E. Harr,et al. Efficient pipelining of nested loops: unroll-and-squash , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[122] William Thies,et al. StreamIt: A Language for Streaming Applications , 2002, CC.
[123] George C. Necula,et al. CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs , 2002, CC.
[124] I. D. Coope,et al. A Convergent Variant of the Nelder–Mead Algorithm, 2002.
[125] Henry Hoffmann,et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.
[126] Alexander J. Smola,et al. Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.
[127] Mapping a Single Assignment Programming Language to Reconfigurable Systems, 2002.
[128] Peter Mattson,et al. A programming system for the imagine media processor, 2002.
[129] Rudy Lauwereins,et al. DRESC: a retargetable compiler for coarse-grained reconfigurable architectures , 2002, 2002 IEEE International Conference on Field-Programmable Technology, 2002. (FPT). Proceedings..
[130] Matthew R. Guthaus,et al. MiBench: A free, commercially representative embedded benchmark suite , 2001, Proceedings of the Fourth Annual IEEE International Workshop on Workload Characterization. WWC-4 (Cat. No.01EX538).
[131] Robert A. van de Geijn,et al. FLAME: Formal Linear Algebra Methods Environment , 2001, TOMS.
[132] Preeti Ranjan Panda,et al. SystemC - a modeling platform supporting multiple design abstractions , 2001, International Symposium on System Synthesis (IEEE Cat. No.01EX526).
[133] Eric Stotzer,et al. Software Pipelining Irregular Loops on the TMS320C6000 VLIW DSP Architecture , 2001, LCTES/OM.
[134] Kalyan Muthukumar,et al. Software Pipelining of Nested Loops , 2001, CC.
[135] William J. Dally,et al. Imagine: Media Processing with Streams , 2001, IEEE Micro.
[136] S. Ramachandran,et al. FPGA implementation of a novel, fast motion estimation algorithm for real-time video compression , 2001, FPGA '01.
[137] John Paul Shen,et al. Register renaming and scheduling for dynamic execution of predicated code , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[138] Alok N. Choudhary,et al. FPGA hardware synthesis from MATLAB , 2001, VLSI Design 2001. Fourteenth International Conference on VLSI Design.
[139] John Wawrzynek,et al. Adapting software pipelining for reconfigurable computing , 2000, CASES '00.
[140] Seth Copen Goldstein,et al. BitValue Inference: Detecting and Exploiting Narrow Bitwidth Computations , 2000, Euro-Par.
[141] John Wawrzynek,et al. Stream Computations Organized for Reconfigurable Execution (SCORE) , 2000, FPL.
[142] Mark Stephenson,et al. Bitwidth analysis with application to silicon compilation , 2000, PLDI '00.
[143] Maya Gokhale,et al. Stream-oriented FPGA computing in the Streams-C high level language , 2000, Proceedings 2000 IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.PR00871).
[144] John Wawrzynek,et al. The Garp Architecture and C Compiler , 2000, Computer.
[145] Seth Copen Goldstein,et al. PipeRench: A Reconfigurable Architecture and Compiler , 2000, Computer.
[146] Daniel D. Gajski,et al. SPECC: Specification Language and Methodology, 2000.
[147] Andrew W. Moore,et al. Q2: memory-based active learning for optimizing noisy continuous functions , 1998, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).
[148] Ranette Halverson,et al. A Study of Software Pipelining for Multi-dimensional Problems, 2000.
[149] M. Budiu,et al. PipeRench: a coprocessor for streaming multimedia acceleration , 1999, Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367).
[150] Keith D. Cooper,et al. Optimizing for reduced code space using genetic algorithms , 1999, LCTES '99.
[151] Carl Ebeling,et al. Architecture design of reconfigurable pipelined datapaths , 1999, Proceedings 20th Anniversary Conference on Advanced Research in VLSI.
[152] Scott Hauck,et al. Adaptive Computing in NASA Multi-Spectral Image Processing, 1999.
[153] Yossi Matias,et al. The Queue-Read Queue-Write PRAM Model: Accounting for Contention in Parallel Algorithms , 1999, SIAM J. Comput..
[154] Bradford L. Chamberlain,et al. The case for high-level parallel programming in ZPL, 1998.
[155] Lawrence Snyder,et al. The implementation and evaluation of fusion and contraction in array languages , 1998, PLDI '98.
[156] Maya Gokhale,et al. NAPA C: compiling for a hybrid RISC/FPGA architecture , 1998, Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251).
[157] Carl Ebeling,et al. Specifying and compiling applications for RaPiD , 1998, Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251).
[158] Ray Andraka,et al. A survey of CORDIC algorithms for FPGA based computers , 1998, FPGA '98.
[159] Joseph A. Fisher,et al. Clustered Instruction-Level Parallel Processors, 1998.
[160] W. PeterM.,et al. Flattening VLIW code generation for imperfectly nested loops, 1998.
[161] Tao Yu,et al. Control mechanism for software pipelining on nested loop , 1997, Proceedings. Advances in Parallel and Distributed Computing.
[162] Robert A. van de Geijn,et al. SUMMA: scalable universal matrix multiplication algorithm , 1995, Concurr. Pract. Exp..
[163] James Demmel,et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology , 1997, ICS '97.
[164] Sarita V. Adve,et al. Shared Memory Consistency Models: A Tutorial , 1996, Computer.
[165] Guang R. Gao,et al. Identifying loops using DJ graphs , 1996, TOPL.
[166] Josep Llosa,et al. Swing modulo scheduling: a lifetime-sensitive approach , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique.
[167] Allan L. Fisher,et al. Flattening and parallelizing irregular, recurrent loop nests , 1995, PPOPP '95.
[168] Scott A. Mahlke,et al. A comparison of full and partial predicated execution support for ILP processors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[169] Amit Ganesh. Fusing loops with backward inter loop data dependence , 1994, SIGP.
[170] David F. Bacon,et al. Compiler transformations for high-performance computing , 1994, CSUR.
[171] B. Ramakrishna Rau,et al. Iterative modulo scheduling: an algorithm for software pipelining loops , 1994, MICRO 27.
[172] Shumeet Baluja,et al. A Method for Integrating Genetic Search Based Function Optimization and Competitive Learning, 1994.
[173] J. Ramanujam,et al. Optimal software pipelining of nested loops , 1994, Proceedings of 8th International Parallel Processing Symposium.
[174] Henry G. Dietz,et al. Loop Coalescing and Scheduling for Barrier MIMD Architectures , 1993, IEEE Trans. Parallel Distributed Syst..
[175] Ken Kennedy,et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution , 1993, LCPC.
[176] Scott A. Mahlke,et al. Reverse If-Conversion , 1993, PLDI '93.
[177] Robert K. Brayton,et al. ESPRESSO-SIGNATURE: A New Exact Minimizer for Logic Functions , 1993, 30th ACM/IEEE Design Automation Conference.
[178] Thomas Ball,et al. Slicing Programs with Arbitrary Control-flow , 1993, AADEBUG.
[179] Grant E. Haab,et al. Enhanced Modulo Scheduling For Loops With Conditional Branches , 1992, [1992] Proceedings the 25th Annual International Symposium on Microarchitecture MICRO 25.
[180] Ken Kennedy,et al. Relaxing SIMD control flow constraints using loop transformations , 1992, PLDI '92.
[181] Thomas W. Reps,et al. The use of program dependence graphs in software engineering , 1992, International Conference on Software Engineering.
[182] Scott A. Mahlke,et al. Using profile information to assist classic code optimizations , 1991, Softw. Pract. Exp..
[183] Jack J. Dongarra,et al. A comparative study of automatic vectorizing compilers , 1991, Parallel Comput..
[184] Lauren L. Smith. Vectorizing C compilers: how good are they? , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[185] Vivek Sarkar,et al. Compact representations for control dependence , 1990, PLDI '90.
[186] Jack Dongarra,et al. Automatic Blocking of Nested Loops, 1990.
[187] Steve Johnson,et al. Compiling C for vectorization, parallelization, and inline expansion , 1988, PLDI '88.
[188] Constantine D. Polychronopoulos. Loop Coalescing: A Compiler Transformation for Parallel Machines , 1987, ICPP.
[189] David A. Padua,et al. Advanced compiler optimizations for supercomputers , 1986, CACM.
[190] Lawrence Snyder,et al. Type architectures, shared memory, and the corollary of modest potential, 1986.
[191] Mark Weiser,et al. Program Slicing , 1981, IEEE Transactions on Software Engineering.
[192] Joseph A. Fisher,et al. Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.
[193] William W. Wadge,et al. Lucid, a nonprocedural language with iteration , 1977, CACM.
[194] John A. Nelder,et al. A Simplex Method for Function Minimization , 1965, Comput. J..