Toward Modeling Cache-Miss Ratio for Dense-Data-Access-Based Optimization
[1] Abhishek Bhattacharjee, et al. Efficient Address Translation for Architectures with Multiple Page Sizes, 2017, ASPLOS.
[2] Henri-Pierre Charles, et al. deGoal a Tool to Embed Dynamic Code Generators into Applications, 2014, CC.
[3] Wesley W. Chu, et al. The page fault frequency replacement algorithm, 1972, AFIPS '72 (Fall, part I).
[4] Samuel Williams, et al. Roofline: an insightful visual performance model for multicore architectures, 2009, CACM.
[5] Daniele G. Spampinato, et al. A basic linear algebra compiler for structured matrices, 2016, 2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[6] Xin-She Yang, et al. Introduction to Algorithms, 2021, Nature-Inspired Optimization Algorithms.
[7] Alvis Cheuk M. Fong, et al. Applying Supervised Learning to the Static Prediction of Locality-Pattern Complexity in Scientific Code, 2018, 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA).
[8] Steven G. Johnson, et al. FFTW: an adaptive software architecture for the FFT, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[9] David A. Patterson, et al. A new golden age for computer architecture, 2019, Commun. ACM.
[10] Millad Ghane, et al. False Sharing Detection in OpenMP Applications Using OMPT API, 2015, IWOMP.
[11] John McCarthy, et al. History of LISP, 1978, SIGP.
[12] Albert Cohen, et al. The Polyhedral Model Is More Widely Applicable Than You Think, 2010, CC.
[13] Vania Marangozova-Martin, et al. BOAST: Bringing Optimization through Automatic Source-to-Source Transformations, 2013, 2013 IEEE 7th International Symposium on Embedded Multicore Socs.
[14] Lars Lundberg, et al. Optimizing dynamic memory management in a multithreaded application executing on a multiprocessor, 1998, Proceedings. 1998 International Conference on Parallel Processing (Cat. No.98EX205).
[15] Chen Ding, et al. Miss Rate Prediction Across Program Inputs and Cache Configurations, 2007, IEEE Transactions on Computers.
[16] Sid Lakhdar, et al. On the Impact of Asynchronous I/O on the performance of the Cube re-mapper at High Performance Computing Scale, 2017.
[17] Richard O’Neil, et al. Convolution operators and $L(p,q)$ spaces, 1963.