An energy-efficient system on a programmable chip platform for cloud applications

Performance analysis of a master-slave based reconfigurable architecture and a standalone SOPC based reconfigurable architecture using an analytical model.
A massive-sessions optimized TCP/IP offload engine that supports up to 100K TCP sessions at a 10 Gbps line rate.
An online dynamic scheduling method that reconfigures or powers off FPGA nodes according to workload variance to reduce runtime energy consumption.
A prototype of a reconfigurable cluster system based on standalone SOPC that provides cloud services, achieving up to 38X speedup in performance and 418X improvement in energy efficiency compared to software-based cloud systems.

Traditional cloud service providers build large data centers from huge numbers of connected commodity computers to meet the ever-growing demand for performance. However, the growth potential of these data centers is limited by their energy consumption and thermal issues, so energy efficiency has become a key issue in building large-scale cloud computing centers. To address this issue, we propose a standalone SOPC (System on a Programmable Chip) based platform for cloud applications. We improve the energy efficiency of cloud computing platforms with two techniques. First, we propose a massive-sessions optimized TCP/IP hardware stack using a macro-pipeline architecture. It allows network packet offloading and application-level data processing to execute as a hardware pipeline, which achieves higher energy efficiency while maintaining peak performance. Second, we propose an online dynamic scheduling strategy. It reconfigures or shuts down FPGA nodes according to workload variance to reduce the runtime energy consumption of a standalone SOPC based reconfigurable cluster system. Two case studies, a webserver application and a cloud-based ECG (electrocardiogram) classification application, are developed to validate the effectiveness of the proposed platform. Evaluation results show that our SOPC based cloud computing platform achieves up to 418X improvement in energy efficiency over commercial cloud systems.
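To make the idea of workload-driven scheduling concrete, the following is a minimal sketch, not the scheduling method of the paper. It assumes a simple sizing rule: each application gets just enough nodes to cover its current load plus a safety margin, and any node left over is powered off. The function name `plan_cluster`, the `headroom` parameter, and the per-node capacity figures are hypothetical and chosen only for illustration.

```python
import math


def plan_cluster(loads, capacities, total_nodes, headroom=0.2):
    """Sketch of a workload-driven node plan (illustrative assumption only).

    loads:       dict mapping application name -> current load (requests/s)
    capacities:  dict mapping application name -> load one node can serve
    total_nodes: number of FPGA nodes in the cluster
    headroom:    fractional safety margin added to each load (hypothetical)

    Returns (assignment, powered_off), where assignment maps each
    application to the number of nodes configured for it.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")

    assignment = {}
    remaining = total_nodes
    for app, load in loads.items():
        capacity = capacities[app]
        if capacity <= 0:
            raise ValueError(f"capacity for {app!r} must be positive")
        # Nodes needed to serve this load with the safety margin.
        needed = math.ceil(load * (1.0 + headroom) / capacity) if load > 0 else 0
        # Never hand out more nodes than the cluster still has free.
        granted = min(needed, remaining)
        assignment[app] = granted
        remaining -= granted

    # Nodes not assigned to any application can be powered off.
    return assignment, remaining


if __name__ == "__main__":
    assignment, powered_off = plan_cluster(
        loads={"webserver": 1000, "ecg": 100},
        capacities={"webserver": 500, "ecg": 100},
        total_nodes=8,
    )
    print(assignment, powered_off)
```

Calling the function again whenever the measured load changes yields a new plan. Nodes whose assignment changes would be reconfigured, and nodes that drop out of the plan would be shut down. A real scheduler would also need to account for reconfiguration latency and session migration, which this sketch ignores.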
