Optimized Big Data Management across Multi-Cloud Data Centers: Software-Defined-Network-Based Analysis

The rapid growth in the number of smart device users has sharply increased the volume of data these devices generate, and this data varies along all the essential V's used to categorize it as big data. Most large service providers, including Google, Amazon, and Microsoft, have deployed many geographically distributed data centers to process this data so that users get quick response times. Hadoop and Spark are widely used by these providers for processing large datasets. However, less emphasis has been given to the underlying infrastructure (the network through which data flows), which is one of the most important components for the successful implementation of any solution designed for this environment. In the worst case, heavy network traffic caused by data migrations across data centers may leave the underlying network infrastructure unable to transfer data packets from source to destination, resulting in performance degradation. To address these issues, in this article we propose a novel SDN-based big data management approach that optimizes the consumption of network resources such as network bandwidth and data storage units. We analyze various components at both the data and control planes that can improve big data analytics across multiple cloud data centers. For example, we analyze the performance of the proposed solution using Bloom-filter-based insertion and deletion of an element in the flow table maintained at the OpenFlow controller, which makes most of the decisions for network traffic classification using a rule-and-action-based mechanism. Using the proposed solution, developers can deploy and analyze real-time traffic behavior for future big data applications in multi-cloud environments (MCE).
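The Bloom-filter-based insertion and deletion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which Bloom filter variant is used; a plain Bloom filter cannot remove elements, so this sketch assumes a counting Bloom filter, a common variant that supports deletion. The class name `CountingBloomFilter`, the helper `flow_key`, the key format, and the parameter values are all hypothetical and chosen only for illustration; they are not taken from the article.

```python
import hashlib


def flow_key(src_ip, dst_ip, dst_port):
    """Hypothetical flow identifier; a real flow table matches on more header fields."""
    return f"{src_ip}|{dst_ip}|{dst_port}"


class CountingBloomFilter:
    """Counting Bloom filter: each slot is a counter, so entries can be removed."""

    def __init__(self, num_counters=1024, num_hashes=4):
        # Illustrative sizes only (assumption), not values from the article.
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, key):
        # Derive num_hashes slot indexes by double hashing one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_counters for i in range(self.num_hashes)]

    def insert(self, key):
        for idx in self._indexes(key):
            self.counters[idx] += 1

    def might_contain(self, key):
        # True means "possibly present" (false positives can occur);
        # False means "definitely absent".
        return all(self.counters[idx] > 0 for idx in self._indexes(key))

    def delete(self, key):
        # Refuse to delete keys that are definitely absent, so their
        # counters are never driven below zero.
        if not self.might_contain(key):
            return False
        for idx in self._indexes(key):
            self.counters[idx] -= 1
        return True


if __name__ == "__main__":
    table = CountingBloomFilter()
    key = flow_key("10.0.0.1", "10.0.1.5", 443)
    table.insert(key)
    print(table.might_contain(key))
    table.delete(key)
    print(table.might_contain(key))
```

In a sketch like this, a membership test costs a fixed number of counter reads regardless of how many flows are stored, which is the usual motivation for placing such a filter in front of a flow-table lookup. Because false positives are possible, a positive answer would still need to be confirmed against the actual flow table, and only keys known to have been inserted should be deleted.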
