DScheduler: Dynamic Network Scheduling Method for MapReduce in Distributed Controllers

MapReduce is the most widely used distributed computing framework because of its excellent parallelism and scalability in processing large-scale data. Improving the performance of MapReduce applications in datacenter networks is one of the most important research topics in distributed computing. The OpenFlow protocol makes it possible to schedule network resources dynamically and thereby provide better link bandwidth for shuffle traffic. Current OpenFlow-based scheduling methods run on a single controller, which cannot handle the volume of switch requests in large-scale datacenter networks. When these methods run on distributed controllers, their performance degrades noticeably because of scheduling conflicts. This paper proposes DScheduler, a dynamic network scheduling method for distributed controllers. DScheduler runs as an application on each SDN controller and avoids most scheduling conflicts at low cost by using locks and communication between controllers. We implement a prototype on Floodlight to demonstrate the design and evaluate its performance. Experimental results show that DScheduler significantly reduces the number of conflicts and improves the performance of OpenFlow-based scheduling on distributed controllers.
