[1] Hyesoon Kim, et al. An integrated GPU power and performance model, 2010, ISCA.
[2] Andreas Klöckner. Loo.py: from Fortran to performance via transformation and substitution rules, 2015, ARRAY@PLDI.
[3] William Gropp, et al. An adaptive performance modeling tool for GPU architectures, 2010, PPoPP '10.
[4] Alexander I. Barvinok, et al. A Polynomial Time Algorithm for Counting Integral Points in Polyhedra when the Dimension Is Fixed, 1993, FOCS.
[5] Andreas Klöckner, et al. Loo.py: transformation-based code generation for GPUs and CPUs, 2014, ARRAY@PLDI.
[6] Kapil Vaswani, et al. A Predictive Performance Model for Superscalar Processors, 2006, 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[7] Samuel Williams, et al. Roofline: an insightful visual performance model for multicore architectures, 2009, CACM.
[8] Sven Verdoolaege, et al. isl: An Integer Set Library for the Polyhedral Model, 2010, ICMS.
[9] Yao Zhang, et al. A quantitative performance analysis model for GPU architectures, 2011, IEEE 17th International Symposium on High Performance Computer Architecture.
[10] Timothy G. Mattson, et al. OpenCL Programming Guide, 2011.
[11] Michael F. P. O'Boyle, et al. Automatic performance model construction for the fast software exploration of new hardware designs, 2006, CASES '06.
[12] Hyesoon Kim, et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness, 2009, ISCA '09.
[13] Teresa H. Y. Meng, et al. Merge: a programming model for heterogeneous multi-core systems, 2008, ASPLOS.
[14] Vincent Loechner, et al. Counting Integer Points in Parametric Polytopes Using Barvinok's Rational Functions, 2007, Algorithmica.
[15] Philip J. Fleming, et al. How not to lie with statistics: the correct way to summarize benchmark results, 1986, CACM.