Parallelizing Compiler Techniques Based on Linear Inequalities
[1] George B. Dantzig,et al. Linear programming and extensions , 1965 .
[2] Constantine D. Polychronopoulos,et al. Symbolic Analysis: A Basis for Parallelization, Optimization, and Scheduling of Programs , 1993, LCPC.
[3] Ken Kennedy,et al. A linear-time algorithm for computing the memory access sequence in data-parallel programs , 1995, PPOPP '95.
[4] Alexander V. Veidenbaum,et al. Detecting redundant accesses to array data , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[5] Brian W. Kernighan,et al. The UNIX™ programming environment , 1979, Softw. Pract. Exp..
[6] Mary Hall,et al. Interprocedural analysis for parallelization: design and experience , 1995 .
[7] Rice University,et al. High performance Fortran language specification , 1993 .
[8] Monica S. Lam,et al. Interprocedural Analysis for Parallelization , 1995, LCPC.
[9] Ken Kennedy,et al. Compiling Fortran D for MIMD distributed-memory machines , 1992, CACM.
[10] I-Chen Wu,et al. An architecture independent programming language for low-level vision , 1989, Comput. Vis. Graph. Image Process..
[11] Manish Gupta,et al. Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers , 1992, IEEE Trans. Parallel Distributed Syst..
[12] Monica S. Lam,et al. Multiprocessors from a software perspective , 1996, IEEE Micro.
[13] Monica S. Lam,et al. Efficient context-sensitive pointer analysis for C programs , 1995, PLDI '95.
[14] Chau-Wen Tseng,et al. Compiler optimizations for eliminating barrier synchronization , 1995, PPOPP '95.
[15] François Irigoin,et al. Interprocedural Array Region Analyses , 1996, International Journal of Parallel Programming.
[16] Anne Rogers,et al. Process decomposition through locality of reference , 1989, PLDI '89.
[17] Thomas Rauber,et al. Automatic Parallelization for Distributed Memory Multiprocessors , 1994, Automatic Parallelization.
[18] Charles Koelbel. Compile-time generation of regular communications patterns , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[19] Kleanthis Psarris,et al. On the perfect accuracy of an approximate subscript analysis test , 1990, ICS '90.
[20] Anoop Gupta,et al. An empirical comparison of the Kendall Square Research KSR-1 and Stanford DASH multiprocessors , 1993, Supercomputing '93. Proceedings.
[21] Thomas R. Gross,et al. Compiling task and data parallel programs for iWarp , 1993, SIGP.
[22] Jean-Louis Pazat,et al. Compiling sequential programs for distributed memory parallel computers with Pandore II , 1993 .
[23] A. Gupta,et al. The Stanford FLASH multiprocessor , 1994, Proceedings of the 21st International Symposium on Computer Architecture.
[24] Guang R. Gao,et al. Scheduling and mapping: software pipelining in the presence of structural hazards , 1995, PLDI '95.
[25] Lawrence Rauchwerger,et al. Effective Automatic Parallelization with Polaris , 1995 .
[26] Barbara M. Chapman,et al. Programming in Vienna Fortran , 1992, Sci. Program..
[27] Monica S. Lam,et al. Maximizing parallelism and minimizing synchronization with affine transforms , 1997, POPL '97.
[28] Williams Ludwell Harrison III. The interprocedural analysis and automatic parallelization of Scheme programs , 1989 .
[29] Geoffrey C. Fox,et al. A Compilation Approach for Fortran 90D/HPF Compilers on Distributed Memory MIMD Computers , 1993 .
[30] David M. Fenwick,et al. The AlphaServer 8000 Series: High-end Server Platform Development , 1995, Digit. Tech. J..
[31] Charles Koelbel,et al. High Performance Fortran Handbook , 1993 .
[32] Scott W. Haney,et al. Is C++ fast enough for scientific computing? , 1994 .
[33] Robert P. Colwell,et al. A VLIW architecture for a trace scheduling compiler , 1987, ASPLOS 1987.
[34] Margaret Martonosi,et al. Evaluating the impact of advanced memory systems on compiler-parallelized codes , 1995, PACT.
[35] W. Kelly,et al. Code generation for multiple mappings , 1995, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation.
[36] Ruben W. Castelino,et al. Internal Organization of the Alpha 21164, a 300-MHz 64-bit Quad-issue CMOS RISC Microprocessor , 1995, Digit. Tech. J..
[37] Susan J. Eggers,et al. Eliminating False Sharing , 1991, ICPP.
[38] Edith Schonberg,et al. An HPF Compiler for the IBM SP2 , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[39] Ken Kennedy,et al. Automatic Data Layout Using 0-1 Integer Programming , 1994, IFIP PACT.
[40] Eugene W. Myers,et al. A precise inter-procedural data flow algorithm , 1981, POPL '81.
[41] Computer Staff. Parallel processors were the future ... and may yet be , 1996 .
[42] Todd C. Mowry,et al. Compiler-directed page coloring for multiprocessors , 1996, ASPLOS VII.
[43] Robert P. Colwell,et al. A VLIW architecture for a trace scheduling compiler , 1987, ASPLOS.
[44] Dror Eliezer Maydan. Accurate analysis of array references , 1993 .
[45] Donald Yeung,et al. THE MIT ALEWIFE MACHINE: A LARGE-SCALE DISTRIBUTED-MEMORY MULTIPROCESSOR , 1991 .
[46] Henry G. Dietz,et al. Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation , 1991, LCPC.
[47] Utpal Banerjee,et al. Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.
[48] Ronald L. Graham,et al. Concrete mathematics - a foundation for computer science , 1991 .
[49] David K. Smith. Theory of Linear and Integer Programming , 1987 .
[50] Guy L. Steele,et al. Fortran at ten gigaflops: the connection machine convolution compiler , 1991, PLDI '91.
[51] Susan J. Eggers,et al. Reducing false sharing on shared memory multiprocessors through compile time data transformations , 1995, PPOPP '95.
[52] Marina C. Chen,et al. Compiling Communication-Efficient Programs for Massively Parallel Machines , 1991, IEEE Trans. Parallel Distributed Syst..
[53] Ken Kennedy,et al. The ParaScope parallel programming environment , 1993, Proc. IEEE.
[54] Monica S. Lam,et al. Data Dependence and Data-Flow Analysis of Arrays , 1992, LCPC.
[55] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst..
[56] Jeffrey D. Ullman,et al. Global Data Flow Analysis and Iterative Algorithms , 1976, J. ACM.
[57] William Pugh,et al. Minimizing communication while preserving parallelism , 1996, ICS '96.
[58] Corinne Ancourt,et al. Scanning polyhedra with DO loops , 1991, PPOPP '91.
[59] Monica S. Lam,et al. Array-data flow analysis and its use in array privatization , 1993, POPL '93.
[60] Zbigniew Chamski,et al. Nested loop sequences: towards efficient loop structures in automatic parallelization , 1994, 1994 Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences.
[61] J. Palmer,et al. Connection Machine model CM-5 system overview , 1992, [Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation.
[62] François Irigoin. Interprocedural analyses for programming environments , 1993 .
[63] Rudolf Eigenmann,et al. Performance Analysis of Parallelizing Compilers on the Perfect Benchmarks Programs , 1992, IEEE Trans. Parallel Distributed Syst..
[64] Ken Kennedy,et al. A technique for summarizing data access and its use in parallelism enhancing transformations , 1989, PLDI '89.
[65] Rudolf Eigenmann,et al. Automatic program parallelization , 1993, Proc. IEEE.
[66] Paul Feautrier,et al. Construction of Do Loops from Systems of Affine Constraints , 1995, Parallel Process. Lett..
[67] Ken Kennedy,et al. Incremental dependence analysis , 1990 .
[68] Chau-Wen Tseng,et al. Compiler optimizations for improving data locality , 1994, ASPLOS VI.
[69] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[70] Randy H. Katz,et al. The effect of sharing on the cache and bus performance of parallel programs , 1989, ASPLOS III.
[71] George B. Dantzig,et al. Fourier-Motzkin Elimination and Its Dual , 1973, J. Comb. Theory A.
[72] John R. Grout,et al. Inline Expansion For The Polaris Research Compiler , 1995 .
[73] Monica S. Lam,et al. Data and computation transformations for multiprocessors , 1995, PPOPP '95.
[74] Monica S. Lam,et al. Communication optimization and code generation for distributed memory machines , 1993, PLDI '93.
[75] Michael E. Wolf,et al. Improving locality and parallelism in nested loops , 1992 .
[76] Monica S. Lam,et al. Global optimizations for parallelism and locality on scalable parallel machines , 1993, PLDI '93.
[77] Michael E. Wolf,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[78] William F. Appelbe,et al. Optimizing Parallel Programs Using Affinity Regions , 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[79] Monica S. Lam,et al. Maximizing Multiprocessor Performance with the SUIF Compiler , 1996, Digit. Tech. J..
[80] M. Schlansker,et al. The Cydra 5 computer system architecture , 1988, Proceedings 1988 IEEE International Conference on Computer Design: VLSI.
[81] Chau-Wen Tseng. An optimizing Fortran D compiler for MIMD distributed-memory machines , 1993 .
[82] Samuel P. Midkiff,et al. An Empirical Study of Precise Interprocedural Array Analysis , 1994, Sci. Program..
[83] Thomas R. Gross,et al. Structured dataflow analysis for arrays and its use in an optimizing compiler , 1990, Softw. Pract. Exp..
[84] Michael Gerndt,et al. Automatic parallelization for distributed-memory multiprocessing systems , 1989 .
[85] P.-S. Tseng,et al. A parallelizing compiler for distributed memory parallel computers , 1989, PLDI 1989.
[86] W. Jalby,et al. To copy or not to copy: a compile-time technique for assessing when data copying should be used to eliminate cache conflicts , 1993, Supercomputing '93.
[87] Monica S. Lam,et al. Interprocedural Parallelization Analysis: Preliminary Results , 1995 .
[88] David A. Padua,et al. Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs , 1991, LCPC.
[89] Charles Koelbel,et al. Semi-Automatic Domain Decomposition in BLAZE , 1987, ICPP.
[90] Fred C. Chow,et al. A portable machine-independent global optimizer--design and measurements , 1984 .
[91] Steven W. K. Tjiang,et al. SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.
[92] Anoop Gupta,et al. The DASH prototype: implementation and performance , 1992, ISCA '92.
[93] Randolph G. Scarborough,et al. A Vectorizing Fortran Compiler , 1986, IBM J. Res. Dev..
[94] Yunheung Paek,et al. Parallel Programming with Polaris , 1996, Computer.
[95] William Pugh,et al. Eliminating false data dependences using the Omega test , 1992, PLDI '92.
[96] Monica S. Lam,et al. An Overview of a Compiler for Scalable Parallel Machines , 1993, LCPC.
[97] Marina C. Chen,et al. The Data Alignment Phase in Compiling Programs for Distributed-Memory Machines , 1991, J. Parallel Distributed Comput..
[98] Williams Ludwell Harrison,et al. The interprocedural analysis and automatic parallelization of Scheme programs , 1990, LISP Symb. Comput..
[99] P. Feautrier. Parametric integer programming , 1988 .
[100] Guy L. Steele,et al. The High Performance Fortran Handbook , 1993 .
[101] Peter Michielse. Programming the Convex Exemplar Series SPP System , 1994, PARA.
[102] John R. Gilbert,et al. Generating local addresses and communication sets for data-parallel programs , 1993, PPOPP '93.
[103] Jean-Louis Pazat,et al. PANDORE: a system to manage data distribution , 1992 .
[104] Barbara G. Ryder,et al. Interprocedural modification side effect analysis with pointer aliasing , 1993, PLDI '93.
[105] William Pugh,et al. A practical algorithm for exact array dependence analysis , 1992, CACM.
[106] John M. Mellor-Crummey,et al. FIAT: A Framework for Interprocedural Analysis and Transformation , 1993, LCPC.
[107] Chau-Wen Tseng,et al. An Overview of the SUIF Compiler for Scalable Parallel Machines , 1995, PPSC.
[108] Monica S. Lam,et al. Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[109] Hudson Benedito Ribas. Obtaining Dependence Vectors for Nested-Loop Computations , 1990, ICPP.
[110] Ken Kennedy,et al. A Methodology for Procedure Cloning , 1993, Computer languages.
[111] Pierre Jouvelot,et al. Semantical interprocedural parallelization: an overview of the PIPS project , 1991 .
[112] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[113] Barbara M. Chapman,et al. Handling Distributed Data in Vienna Fortran Procedures , 1992, LCPC.
[114] P. Feautrier. Array expansion , 1988 .
[115] Wei Li,et al. Unifying data and control transformations for distributed shared-memory machines , 1995, PLDI '95.
[116] John R. Gilbert,et al. Aligning parallel arrays to reduce communication , 1994, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation.
[117] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[118] Ken Kennedy,et al. An Implementation of Interprocedural Bounded Regular Section Analysis , 1991, IEEE Trans. Parallel Distributed Syst..
[119] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[120] Anant Agarwal,et al. Automatic Partitioning of Parallel Loops for Cache-Coherent Multiprocessors , 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[121] William Pugh,et al. The Omega test: A fast and practical integer programming algorithm for dependence analysis , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[122] Joseph A. Fisher,et al. Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.
[123] Josep Torrellas,et al. Shared Data Placement Optimizations to Reduce Multiprocessor Cache Miss Rates , 1990, ICPP.
[124] Michael L. Scott,et al. False sharing and its effect on shared memory performance , 1993 .
[125] Martine Ancourt. Génération automatique de codes de transfert pour multiprocesseurs à mémoires locales , 1991 .
[126] Peng Tu,et al. Automatic array privatization and demand-driven symbolic analysis , 1996 .
[127] Piyush Mehrotra,et al. Programming distributed memory architectures using Kali , 1990 .
[128] Paul Havlak,et al. Interprocedural symbolic analysis , 1995 .
[129] Paul Feautrier,et al. Direct parallelization of call statements , 1986, SIGPLAN '86.
[130] Thierry Jéron,et al. Towards Automatic Distribution of Testers for Distributed Conformance Testing , 1998, FORTE.
[131] Barbara M. Chapman,et al. Supercompilers for parallel and vector computers , 1990, ACM Press frontier series.