Network MapReduce

Network data analytics is increasingly used for enhanced network visibility and controllability. We draw out the similarities between the Software-Defined Networking (SDN) architecture and the MapReduce programming model. Inspired by these similarities, we suggest the data plane innovations needed to make network data plane devices function as distributed mappers and, optionally, reducers. A streaming MapReduce architecture for network data can then conveniently solve a range of network monitoring and management problems. Unlike traditional network data analytics systems, our proposed system embeds the data analytics engine directly in the network infrastructure. This affinity leads to a concise system architecture and a better cost-performance ratio. On top of this architecture, we propose a general MapReduce-like programming model for real-time, one-pass network data analytics, which involves joint in-network and out-of-network computing. We show that this model can address a wide range of interactive queries from various network applications. This position paper argues that the white-box trend does not necessarily lead to simple and dumb networking devices. Rather, the defining characteristics of the next-generation white box are openness and programmability, so that network devices can be made smart and versatile enough to support new services and applications.
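As an illustration of the joint in-network and out-of-network computing described above, the following is a minimal Python sketch of a one-pass heavy-flow query. It is not the paper's implementation: the names `Packet`, `switch_map`, `switch_combine`, and `collector_reduce` are hypothetical, the switches are simulated as plain functions, and it assumes each packet is counted at exactly one device (for example, its ingress switch) so that partial sums can be added without double counting.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple

FlowKey = Tuple[str, str]  # (source, destination); a simplified flow key (assumption)


class Packet(NamedTuple):
    src: str
    dst: str
    size: int  # bytes


def switch_map(packets: Iterable[Packet]) -> Iterator[Tuple[FlowKey, int]]:
    """In-network mapper: emit one (flow key, byte count) pair per packet."""
    for p in packets:
        yield (p.src, p.dst), p.size


def switch_combine(pairs: Iterable[Tuple[FlowKey, int]]) -> Dict[FlowKey, int]:
    """Optional in-network reducer: pre-aggregate locally to shrink the export."""
    partial: Counter = Counter()
    for key, size in pairs:
        partial[key] += size
    return dict(partial)


def collector_reduce(partials: Iterable[Dict[FlowKey, int]],
                     threshold: int) -> Dict[FlowKey, int]:
    """Out-of-network reducer: merge per-switch results, keep flows >= threshold."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return {key: size for key, size in total.items() if size >= threshold}


# Two simulated switches, each seeing a disjoint set of packets in one pass.
switch1 = [Packet("a", "b", 100), Packet("a", "b", 200), Packet("c", "d", 50)]
switch2 = [Packet("a", "b", 300), Packet("c", "d", 25)]

partials = [switch_combine(switch_map(s)) for s in (switch1, switch2)]
heavy_flows = collector_reduce(partials, threshold=500)
```

In this sketch the map and combine steps stand in for work done on the data plane, while the final reduce stands in for a collector outside the network; only the per-switch partial sums cross that boundary.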
