Research of a MapReduce Communication Data Stream Processing Model

In this paper, we propose CDS-MR, a MapReduce-based deep service analysis system built on the Hive/Hadoop frameworks. Normally, the job of a switch is to transmit data. There is, however, a tendency to put more capability into the switch, such as retaining or querying the data that passes through it. We therefore need to consider what can be kept in working storage and how to analyze it. An ordinary database cannot handle such massive datasets and complex ad hoc queries. MapReduce is a popular and widely used fine-grained parallel runtime developed for high-performance processing of large-scale datasets.
