Architecture comparisons between Nvidia and ATI GPUs: Computation parallelism and data communications
Bin Li | Lu Peng | Ying Zhang | Jianmin Chen | Jih-Kwon Peir
[1] G. Schwarz. Estimating the Dimension of a Model, 1978.
[2] D. N. Geary. Mixture Models: Inference and Applications to Clustering, 1989.
[3] A. Raftery, et al. Model-based Gaussian and non-Gaussian clustering, 1993.
[4] G. Celeux, et al. Comparison of the mixture and the classification maximum likelihood in cluster analysis, 1993.
[5] D. Botstein, et al. Cluster analysis and display of genome-wide expression patterns, 1998, Proceedings of the National Academy of Sciences of the United States of America.
[6] Adrian E. Raftery, et al. Model-Based Clustering, Discriminant Analysis, and Density Estimation, 2002.
[7] Lieven Eeckhout, et al. Measuring Program Similarity: Experiments with SPEC CPU Benchmark Suites, 2005, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2005).
[8] Lu Peng, et al. Memory Performance and Scalability of Intel's and AMD's Dual-Core Processors: A Case Study, 2007, 2007 IEEE International Performance, Computing, and Communications Conference.
[9] Majid Sarrafzadeh, et al. Energy-aware high performance computing with graphic processing units, 2008, CLUSTER 2008.
[10] Carl Staelin, et al. Memory hierarchy performance measurement of commercial dual-core desktop processors, 2008, J. Syst. Archit.
[11] Hyesoon Kim, et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness, 2009, ISCA '09.
[12] Song Huang, et al. On the energy efficiency of graphics processing units for scientific computing, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[13] Wolfgang E. Nagel, et al. Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[14] J. Xu. OpenCL – The Open Standard for Parallel Programming of Heterogeneous Systems, 2009.
[15] Andreas Moshovos, et al. Demystifying GPU microarchitecture through microbenchmarking, 2010, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).
[16] Hyesoon Kim, et al. An integrated GPU power and performance model, 2010, ISCA.
[17] Tao Li, et al. Exploring GPGPU workloads: Characterization methodology, analysis and microarchitecture evaluation implications, 2010, IEEE International Symposium on Workload Characterization (IISWC'10).
[18] Reiji Suda, et al. Investigation on the power efficiency of multi-core and GPU Processing Element in large scale SIMD computation with CUDA, 2010, International Conference on Green Computing.
[19] Collin McCurdy, et al. The Scalable Heterogeneous Computing (SHOC) benchmark suite, 2010, GPGPU-3.
[20] Xiaoming Li, et al. A Micro-benchmark Suite for AMD GPUs, 2010, 2010 39th International Conference on Parallel Processing Workshops.
[21] Bin Li, et al. Performance and Power Analysis of ATI GPU: A Statistical Approach, 2011, 2011 IEEE Sixth International Conference on Networking, Architecture, and Storage.
[22] Bin Li, et al. Tree structured analysis on GPU power study, 2011, 2011 IEEE 29th International Conference on Computer Design (ICCD).
[23] Kim M. Hazelwood, et al. Where is the data? Why you cannot debate CPU vs. GPU performance without the answer, 2011, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[24] Mohamed F. Ahmed, et al. A comparative benchmarking of the FFT on Fermi and Evergreen GPUs, 2011, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[25] Yao Zhang, et al. A quantitative performance analysis model for GPU architectures, 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[26] Jack J. Dongarra, et al. From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming, 2012, Parallel Comput.