Scale-Out vs Scale-Up
Reza Azimi | Sherief Reda | Tyler Fox | Wendy Gonzalez
[1] H. Abdi. Partial Least Square Regression PLS-Regression, 2007.
[2] Jialin Li, et al. Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency, 2014, SoCC.
[3] Juan Touriño, et al. Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures, 2009, PVM/MPI.
[4] Amar Phanishayee, et al. FAWN: a fast array of wimpy nodes, 2009, SOSP '09.
[5] Ananta Tiwari, et al. Compute bottlenecks on the new 64-bit ARM, 2015, E2SC '15.
[6] Brian Bockelman, et al. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi, 2014, ArXiv.
[7] Pascal Bouvry, et al. Performance Evaluation and Energy Efficiency of High-Density HPC Platforms Based on Intel, AMD and ARM Processors, 2013, EE-LSDS.
[8] Babak Falsafi, et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware, 2012, ASPLOS XVII.
[9] Eduard Ayguadé, et al. The Mont-Blanc Prototype: An Alternative Approach for HPC Systems, 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[10] Alex Ramírez, et al. The low-power architecture approach towards exascale computing, 2011, ScalA '11.
[11] Antti Ylä-Jääski, et al. Energy- and Cost-Efficiency Analysis of ARM-Based Clusters, 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).
[12] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Karthikeyan Sankaralingam, et al. Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures, 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[14] Drago Zagar, et al. Towards an energy efficient SoC computing cluster, 2014, 2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO).
[15] Thomas F. Wenisch, et al. Thin servers with smart pipes: designing SoC accelerators for memcached, 2013, ISCA.
[16] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[17] Mateo Valero, et al. Supercomputing with commodity CPUs: Are mobile SoCs ready for HPC?, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[18] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[19] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.
[20] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[21] Jignesh M. Patel, et al. Wimpy node clusters: what about non-wimpy workloads?, 2010, DaMoN '10.
[22] Daisuke Takahashi, et al. The HPC Challenge (HPCC) benchmark suite, 2006, SC.
[23] Geoffrey Fox, et al. Evaluating ARM HPC clusters for scientific workloads, 2015, Concurr. Comput. Pract. Exp.
[24] Luiz Marcos Garcia Gonçalves, et al. Towards green data centers: A comparison of x86 and ARM architectures power efficiency, 2012, J. Parallel Distributed Comput.
[25] Sherief Reda, et al. Scheduling challenges and opportunities in integrated CPU+GPU processors, 2016, 2016 14th ACM/IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia).
[26] Brad Fitzpatrick, et al. Distributed caching with memcached, 2004.
[27] Alejandro Rico, et al. Tibidabo: Making the case for an ARM-based HPC system, 2014, Future Gener. Comput. Syst.
[28] Andrzej Nowak, et al. Hierarchical cycle accounting: a new method for application performance tuning, 2015, 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[29] Reza Azimi, et al. How Good Are Low-Power 64-Bit SoCs for Server-Class Workloads?, 2015, 2015 IEEE International Symposium on Workload Characterization.