Dynamic feedback: an effective technique for adaptive computing
[1] John Zahorjan,et al. Improving the performance of runtime parallelization , 1993, PPOPP '93.
[2] Scott A. Mahlke,et al. Profile‐guided automatic inline expansion for C programs , 1992, Softw. Pract. Exp..
[3] Eric A. Brewer,et al. High-level optimization via automated statistical modeling , 1995, PPOPP '95.
[4] L. Rauchwerger,et al. The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization , 1999, IEEE Trans. Parallel Distributed Syst..
[5] Robert H. Halstead,et al. Lazy task creation: a technique for increasing the granularity of parallel programs , 1990, LISP and Functional Programming.
[6] Andrew A. Chien,et al. A Hybrid Execution Model for Fine-Grained Languages on Distributed Memory Multicomputers , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[7] Urs Hölzle,et al. Optimizing dynamically-dispatched calls with run-time type feedback , 1994, PLDI '94.
[8] Micha Sharir,et al. Experience with the SETL Optimizer , 1983, TOPL.
[9] Monica S. Lam,et al. Global optimizations for parallelism and locality on scalable parallel machines , 1993, PLDI '93.
[10] Piet Hut,et al. A hierarchical O(N log N) force-calculation algorithm , 1986, Nature.
[11] Monica S. Lam,et al. Communication optimization and code generation for distributed memory machines , 1993 .
[12] Brian N. Bershad,et al. Fast, effective dynamic compilation , 1996, PLDI '96.
[13] Robert J. Fowler,et al. Adaptive cache coherency for detecting migratory shared data , 1993, ISCA '93.
[14] Seth Copen Goldstein,et al. Lazy Threads: Implementing a Fast Parallel Call , 1996, J. Parallel Distributed Comput..
[15] James R. Larus,et al. Application-specific protocols for user-level shared memory , 1994, Proceedings of Supercomputing '94.
[16] Harry Berryman,et al. Multiprocessors and run-time compilation , 1991, Concurr. Pract. Exp..
[17] Andrew A. Chien,et al. Obtaining sequential efficiency for concurrent object-oriented languages , 1995, POPL '95.
[18] Ken Kennedy,et al. Automatic Data Layout for High Performance Fortran , 1995, SC.
[19] Reinaldo J. Michelena,et al. Tomographic string inversion , 1990 .
[20] W. G. Morris,et al. CCG: a prototype coagulating code generator , 1991, PLDI '91.
[21] Donald E. Knuth,et al. An empirical study of FORTRAN programs , 1971, Softw. Pract. Exp..
[22] Monica S. Lam,et al. Heterogeneous parallel programming in Jade , 1992, Proceedings Supercomputing '92.
[23] Monica S. Lam,et al. Global optimizations for parallelism and locality on scalable parallel machines , 1993 .
[24] Monica S. Lam,et al. Communication optimization and code generation for distributed memory machines , 1993, PLDI '93.
[25] Gregor Kiczales,et al. Beyond the Black Box: Open Implementation , 1996, IEEE Softw..
[26] Martin C. Rinard,et al. Commutativity analysis: a new analysis framework for parallelizing compilers , 1996, PLDI '96.
[27] Mary F. Fernández,et al. Simple and effective link-time optimization of Modula-3 programs , 1995, PLDI '95.
[28] David Grove,et al. Profile-guided receiver class prediction , 1995, OOPSLA.
[29] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.
[30] Karl Pettis,et al. Profile guided code positioning , 1990, PLDI '90.
[31] Dawson R. Engler,et al. VCODE: a retargetable, extensible, very fast dynamic code generation system , 1996, PLDI '96.
[32] Peter Lee,et al. Optimizing ML with run-time code generation , 1996, PLDI '96.
[33] Daeyeon Park,et al. Improving the effectiveness of software prefetching with adaptive executions , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques.
[34] Manish Gupta,et al. Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers , 1992, IEEE Trans. Parallel Distributed Syst..
[35] Daniel E. Lenoski,et al. The design and analysis of DASH: a scalable directory-based multiprocessor , 1992 .
[36] Brian N. Bershad,et al. Dynamic Page Mapping Policies for Cache Conflict Resolution on Standard Hardware , 1994, OSDI.
[37] David W. Wall,et al. Global register allocation at link time , 1986, SIGPLAN '86.
[38] Martin Rinard,et al. Synchronization transformations for parallel computing , 1999, ACM-SIGACT Symposium on Principles of Programming Languages.