Compute bottlenecks on the new 64-bit ARM
Ananta Tiwari | Michael Laurenzano | Laura Carrington | Adam Jundt | Allyson Cauble-Chantrenne | Joshua Peraza
[1] Vincent M. Weaver, et al. Design and Analysis of a 32-bit Embedded High-Performance Cluster Optimized for Energy and Performance, 2014, 2014 Hardware-Software Co-Design for High Performance Computing.
[2] Pascal Bouvry, et al. Performance Evaluation and Energy Efficiency of High-Density HPC Platforms Based on Intel, AMD and ARM Processors, 2013, EE-LSDS.
[3] Heng Tao Shen, et al. Principal Component Analysis, 2009, Encyclopedia of Biometrics.
[4] Ananta Tiwari, et al. Characterizing the Performance-Energy Tradeoff of Small ARM Cores in HPC Computation, 2014, Euro-Par.
[5] Andreas Moshovos, et al. Instruction flow-based front-end throttling for power-aware high-performance processors, 2001, ISLPED '01.
[6] Alejandro Rico, et al. Tibidabo: Making the case for an ARM-based HPC system, 2014, Future Gener. Comput. Syst.
[7] Richard F. Gunst, et al. Applied Regression Analysis, 1999, Technometrics.
[8] Phillip Stanley-Marbell, et al. Performance, Power, and Thermal Analysis of Low-Power Processors for Scale-Out Systems, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[9] Brian Bockelman, et al. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi, 2014, ArXiv.
[10] Sandia Report, et al. Improving Performance via Mini-applications, 2009.
[11] Richard O. Duda, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.
[12] David H. Bailey, et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.
[13] Francieli Zanon Boito, et al. Performance/energy trade-off in scientific computing: the case of ARM big.LITTLE and Intel Sandy Bridge, 2015, IET Comput. Digit. Tech.
[14] Ananta Tiwari, et al. Making the Most of SMT in HPC, 2014, ACM Trans. Archit. Code Optim.
[15] Pradeep Dubey, et al. Can traditional programming bridge the Ninja performance gap for parallel computing applications?, 2015, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[16] John L. Henning. SPEC CPU2006 benchmark descriptions, 2006, CARN.
[17] Mark S. Gordon, et al. Performance and energy efficiency analysis of 64-bit ARM using GAMESS, 2015, Co-HPC@SC.
[18] P. Cochat, et al. Et al, 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[19] Kai Li, et al. The PARSEC benchmark suite: Characterization and architectural implications, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[20] Antti Ylä-Jääski, et al. Energy- and Cost-Efficiency Analysis of ARM-Based Clusters, 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).
[21] Simon D. Hammond, et al. Analysis of Cray XC30 Performance Using Trinity-NERSC-8 Benchmarks and Comparison with Cray XE6 and IBM BG/Q, 2013, PMBS@SC.
[22] Chris H. Q. Ding, et al. K-means clustering via principal component analysis, 2004, ICML.