A hierarchical model of data locality
Chen Ding | Mitsunori Ogihara | Chengliang Zhang | Youfeng Wu | Yutao Zhong
[1] Michael D. Smith,et al. Procedure placement using temporal-ordering information , 1999, TOPL.
[2] Ken Kennedy,et al. Automatic data layout for distributed-memory machines , 1998, TOPL.
[3] Ken Kennedy,et al. Improving effective bandwidth through compiler enhancement of global cache reuse , 2004 .
[4] Alan Jay Smith,et al. Cache Memories , 1982, CSUR.
[5] Hwansoo Han,et al. Locality Optimizations For Adaptive Irregular Scientific Codes , 2000 .
[6] Robert E. Tarjan,et al. Self-adjusting binary search trees , 1985, JACM.
[7] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[8] Ulrich Kremer,et al. Automatic data layout for distributed-memory machines , 1998 .
[9] Christos H. Papadimitriou,et al. Computational complexity , 1993 .
[10] Steve Carr,et al. Instruction based memory distance analysis and its application to optimization , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[11] Kristof Beyls,et al. Reuse Distance-Based Cache Hint Selection , 2002, Euro-Par.
[12] John M. Mellor-Crummey,et al. Cross-architecture performance predictions for scientific applications using parameterized models , 2004, SIGMETRICS '04/Performance '04.
[13] Mihalis Yannakakis,et al. The complexity of multiway cuts (extended abstract) , 1992, STOC '92.
[14] Donald E. Knuth,et al. An empirical study of FORTRAN programs , 1971, Softw. Pract. Exp..
[15] Gabriel Marin. Scalable Cross-Architecture Predictions of Memory Hierarchy Response for Scientific Applications , 2005 .
[16] Ken Kennedy,et al. Improving memory hierarchy performance for irregular applications , 1999, ICS '99.
[17] Matteo Frigo,et al. Cache-oblivious algorithms , 1999, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039).
[18] Chen Ding,et al. Locality phase prediction , 2004, ASPLOS XI.
[19] Irving L. Traiger,et al. Evaluation Techniques for Storage Hierarchies , 1970, IBM Syst. J..
[20] Chen Ding,et al. Array regrouping and structure splitting using whole-program reference affinity , 2004, PLDI '04.
[21] Yan Solihin,et al. Predicting inter-thread cache contention on a chip multi-processor architecture , 2005, 11th International Symposium on High-Performance Computer Architecture.
[22] Ken Kennedy,et al. Typed Fusion with Applications to Parallel and Sequential Code Generation , 1994 .
[23] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time , 1999, PLDI '99.
[24] Paul D. Hovland,et al. Metrics and models for reordering transformations , 2004, MSP '04.
[25] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPL.
[26] Dror Rawitz,et al. The hardness of cache conscious data placement , 2002, POPL '02.
[27] Ken Kennedy,et al. Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings , 2001, International Journal of Parallel Programming.
[28] Keshav Pingali,et al. Data-centric multi-level blocking , 1997, PLDI '97.
[29] Chen Ding,et al. Regression-Based Multi-Model Prediction of Data Reuse Signature , 2003 .
[30] Robert E. Tarjan,et al. Amortized efficiency of list update and paging rules , 1985, CACM.
[31] M. Ogihara,et al. Finding the Reference Affinity Groups in Trace using Sampling Method , 2004 .
[32] Bowen Alpern,et al. A model for hierarchical memory , 1987, STOC.
[33] Alain Darte. On the Complexity of Loop Fusion , 2000, Parallel Comput..
[34] Michael A. Bender,et al. Cache-oblivious B-trees , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[35] Mithuna Thottethodi,et al. Nonlinear array layouts for hierarchical memory systems , 1999, ICS '99.
[36] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time , 1999 .
[37] James R. Larus,et al. Cache-conscious structure layout , 1999, PLDI '99.
[38] Larry Carter,et al. Compile-time composition of run-time data and iteration reorderings , 2003, PLDI '03.
[39] Marc Snir,et al. On the Theory of Spatial and Temporal Locality , 2005 .
[40] Jeremy D. Frens,et al. Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code , 1997, PPOPP '97.
[41] Jeremy D. Frens,et al. QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism , 2003, PPoPP '03.
[42] Ken Kennedy,et al. Improving effective bandwidth through compiler enhancement of global cache reuse , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[43] Jeffrey Scott Vitter,et al. External memory algorithms and data structures: dealing with massive data , 2001, CSUR.
[44] Chen Ding,et al. Miss rate prediction across all program inputs , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[45] Petra Perner,et al. Data Mining - Concepts and Techniques , 2002, Künstliche Intell..
[46] Richard E. Hank,et al. Region-based compilation: an introduction and motivation , 1995, MICRO 1995.
[47] Khalid Omar Thabit,et al. Cache management by the compiler , 1982 .
[48] Chandra Krintz,et al. Cache-conscious data placement , 1998, ASPLOS VIII.
[49] Galen C. Hunt,et al. The Coign automatic distributed partitioning system , 1999, OSDI '99.
[50] Trishul M. Chilimbi. Efficient representations and abstractions for quantifying and exploiting data reference locality , 2001, PLDI '01.
[51] Yutao Zhong,et al. Predicting whole-program locality through reuse distance analysis , 2003, PLDI.
[52] Bowen Alpern,et al. The uniform memory hierarchy model of computation , 2005, Algorithmica.
[53] Peter J. Denning,et al. Working Sets Past and Present , 1980, IEEE Transactions on Software Engineering.
[54] Sally A. McKee,et al. Improving the computational intensity of unstructured mesh applications , 2005, ICS '05.
[55] Ken Kennedy,et al. Transforming loops to recursion for multi-level memory hierarchies , 2000, PLDI '00.