论文信息 - Cost effective data center servers

Cost effective data center servers

The exploding growth of digitalized information has led to the rapid growth of data centers, both in numbers and in size. Cluster has been the dominating system architecture used in most data centers. However, the increasingly diversified data center applications have requirements beyond what the cluster architecture can deliver. For instance, clouding computing requires flexible sharing of all data center resources. Big data applications often need large memory capacity. A few applications can use GPGPU effectively. Existing system might be extended to a certain degree to meet those needs. Those extensions however would often be prohibitively expensive. The paper presents our attempt to design a system using commodity products that can meet the varying needs of many emerging data center applications in a cost-effective way. Our attempt is to create a system by connecting multiple nodes through a PCIe switch and then extend the software stack to support resource sharing among these nodes. In particular, a node can directly use the memory, NIC, and GPGPU of other nodes through the PCIe switch with no or little involvement from other nodes. We build a prototype as our evaluation platform. Our evaluation results indicate that those resources can be shared effectively in many cases. For using remote memory as block device, our prototype system has 5 times bandwidth, 11 times IOPS and 1/12 latency compared with the system connected by 10GigE in average for Orion benchmark; Using remote GPGPU via PCIe switch achieves average 60 times speedup than the case without GPGPU, and the performance loss is also acceptable (its average execution time is 1/3 of local GPGPU) for micro-benchmarks from GPU computing SDK; And using remote NIC via PCIe switch achieves average 95% bandwidth and 1.4 times latency of local NIC in httperf testing. While our prototype system offers multiple benefits, it is not perfect and has a lot room for further optimization and extension. We hope the outcome presented in this paper will encourage more researchers to join us in designing highly efficient and cost-effective servers.

[1] Trevor N. Mudge,et al. Understanding and Designing New Server Architectures for Emerging Warehouse-Computing Environments , 2008, 2008 International Symposium on Computer Architecture.

[2] Parag Agrawal,et al. The case for RAMClouds: scalable high-performance storage entirely in DRAM , 2010, OPSR.

[3] Jeanna Neefe Matthews,et al. An Exploration of Network RAM , 1998 .

[4] Venkata Krishnan. Evaluation of an Integrated PCI Express IO Expansion and Clustering Fabric , 2008, 2008 16th IEEE Symposium on High Performance Interconnects.

[5] Luiz André Barroso,et al. Web Search for a Planet: The Google Cluster Architecture , 2003, IEEE Micro.

[6] John Byrne,et al. Power-efficient networking for balanced system designs: early experiences with PCIe , 2011, HotPower '11.

[7] Dhabaleswar K. Panda,et al. Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device , 2005, 2005 IEEE International Conference on Cluster Computing.

[8] Venkata Krishnan. Towards an integrated IO and clustering solution using PCI express , 2007, 2007 IEEE International Conference on Cluster Computing.

[9] Amar Phanishayee,et al. FAWN: a fast array of wimpy nodes , 2009, SOSP '09.

[10] Christoforos E. Kozyrakis,et al. Towards energy-proportional datacenter memory with mobile DRAM , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[11] Ren Wang,et al. A collaborative memory system for high-performance and cost-effective clustered architectures , 2011, ASBD '11.

[12] Thomas F. Wenisch,et al. Disaggregated memory for expansion and sharing in blade servers , 2009, ISCA '09.

[13] Jaspal Subhlok,et al. Fabric convergence implications on systems architecture , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[14] Umesh Deshpande,et al. MemX: Virtualization of Cluster-Wide Memory , 2010, 2010 39th International Conference on Parallel Processing.