Importance of explicit vectorization for CPU and GPU software performance
[1] F. Guerra. Spin Glasses, 2005, cond-mat/0507581.
[2] Junyi Xia,et al. High performance computing for deformable image registration: towards a new paradigm in adaptive radiotherapy, 2008, Medical physics.
[3] Wen-mei W. Hwu,et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA , 2008, PPoPP.
[4] Ken Kennedy,et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001 .
[5] Takuji Nishimura,et al. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator , 1998, TOMC.
[6] Donald Ervin Knuth,et al. The Art of Computer Programming , 1968 .
[7] Jie Cheng,et al. Programming Massively Parallel Processors. A Hands-on Approach , 2010, Scalable Comput. Pract. Exp..
[8] Saraju P. Mohanty. GPU-CPU multi-core for real-time signal processing , 2009, 2009 Digest of Technical Papers International Conference on Consumer Electronics.
[9] Reuven Y. Rubinstein,et al. Simulation and the Monte Carlo method , 1981, Wiley series in probability and mathematical statistics.
[10] Fabián A. Chudak,et al. Investigating the performance of an adiabatic quantum optimization processor , 2010, Quantum Inf. Process..
[11] M. Suzuki,et al. Generalized Trotter's formula and systematic approximants of exponential operators and inner derivations with applications to many-body problems , 1976 .
[12] Donald E. Knuth,et al. Sorting and Searching , 1973 .
[13] Donald E. Knuth,et al. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd ed.) , 1998 .
[14] Allen,et al. Optimizing Compilers for Modern Architectures , 2004 .
[15] Firas Hamze,et al. High-performance Physics Simulations Using Multi-core CPUs and GPGPUs in a Volunteer Computing Context , 2011, Int. J. High Perform. Comput. Appl..
[16] Firas Hamze,et al. Robust Parameter Selection for Parallel Tempering , 2010 .
[17] Jason Wittenberg,et al. Clarify: Software for Interpreting and Presenting Statistical Results , 2003 .
[18] Peter Stone,et al. Improving particle filter performance using SSE instructions , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] James B. Anderson,et al. Quantum Monte Carlo: Origins, Development, Applications , 2007 .
[20] M. V. Wilkes,et al. The Art of Computer Programming, Volume 3, Sorting and Searching , 1974 .
[21] Wolfgang Paul,et al. GPU accelerated Monte Carlo simulation of the 2D and 3D Ising model , 2009, J. Comput. Phys..
[22] N. Metropolis,et al. Equation of State Calculations by Fast Computing Machines , 1953, Resonance.
[23] Jens H. Krüger,et al. A Survey of General‐Purpose Computation on Graphics Hardware , 2007, Eurographics.
[24] Michael Gschwind,et al. Using advanced compiler technology to exploit the performance of the Cell Broadband Engine™ architecture , 2006, IBM Syst. J..
[25] Hamid Sarbazi-Azad,et al. Efficient SIMD Numerical Interpolation , 2005, HPCC.
[26] Stanimire Tomov,et al. Benchmarking and implementation of probability-based simulations on programmable graphics cards , 2003, Comput. Graph..
[27] Nobuhiko Saitô,et al. Statistical Physics I : Equilibrium Statistical Mechanics , 1983 .
[28] L. Ridgway Scott,et al. Scientific Parallel Computing , 2005 .
[29] Emile H. L. Aarts,et al. Simulated annealing and Boltzmann machines - a stochastic approach to combinatorial optimization and neural computing , 1990, Wiley-Interscience series in discrete mathematics and optimization.