A Principled Kernel Testbed for Hardware/Software Co-Design Research
James Demmel | Erich Strohmaier | David H. Bailey | Khaled Z. Ibrahim | Kamesh Madduri | Samuel Williams | Alexander D. Kaiser
[1] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory, 1992, CARN.
[2] David B. Yoffie,et al. Intel Corporation 2005, 2005.
[3] Kunle Olukotun,et al. STAMP: Stanford Transactional Applications for Multi-Processing , 2008, 2008 IEEE International Symposium on Workload Characterization.
[4] J. Demmel,et al. A Testing Infrastructure for LAPACK's Symmetric Eigensolvers, 2007.
[5] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[6] Samuel Williams,et al. A design methodology for domain-optimized power-efficient supercomputing , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[7] Samuel Williams,et al. Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[8] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[9] Volume Assp,et al. Acoustics, Speech, and Signal Processing, 1983.
[10] David A. Bader,et al. BioPerf: a benchmark suite to evaluate high-performance computer architecture on bioinformatics applications, 2005, Proceedings of the IEEE Workload Characterization Symposium, 2005.
[11] Samuel Williams,et al. Lattice Boltzmann simulation optimization on leading multicore platforms , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[12] David A. Bader. Designing Scalable Synthetic Compact Applications for Benchmarking High Productivity Computing Systems, 2006.
[13] Edward A. Lee,et al. The Parallel Computing Laboratory at U.C. Berkeley: A Research Agenda Based on the Berkeley View, 2008.
[14] Samuel Williams,et al. An auto-tuning framework for parallel multicore stencil computations , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[15] Jack J. Dongarra,et al. Automated empirical optimizations of software and the ATLAS project, 2001, Parallel Comput.
[16] Berkin Özisikyilmaz,et al. MineBench: A Benchmark Suite for Data Mining Workloads , 2006, 2006 IEEE International Symposium on Workload Characterization.
[17] Glenn Reinman,et al. ParallAX: an architecture for real-time physics , 2007, ISCA '07.
[18] Miodrag Potkonjak,et al. MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[19] Samuel Williams,et al. Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[20] Yen-Kuang Chen,et al. The ALPBench benchmark suite for complex multimedia applications, 2005, Proceedings of the IEEE Workload Characterization Symposium, 2005.
[21] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[22] Keshav Pingali,et al. Lonestar: A suite of parallel irregular programs , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[23] Katherine Yelick,et al. OSKI: A library of automatically tuned sparse matrix kernels, 2005.
[24] Yuefan Deng,et al. New trends in high performance computing , 2001, Parallel Computing.
[25] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley, 2006.
[26] Steven G. Johnson,et al. FFTW: an adaptive software architecture for the FFT, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98.
[27] Samuel Williams,et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[28] David H. Bailey,et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.