Implementation of on-process aggregation for efficient big data processing in Hadoop MapReduce environment

The term Big Data refers to very large data sets whose volume, variability, and velocity make them difficult to manage, process, or analyze. Hadoop is used to analyze data of this kind; however, processing is very time-consuming. To address this problem and reduce response time, one solution is to execute the job partially, so that an approximate, early result becomes available to the user before the job completes. The proposed system provides a new MapReduce architecture that allows data to be divided for easier and earlier processing. This reduces processing time and also improves system utilization for batch jobs. The proposed system presents a new version of the Hadoop MapReduce framework that supports on-process aggregation, which allows users to get early results of a job while it is still computing. The technique will be evaluated using real-world datasets and applications, with the aim of improving the system's performance in terms of precision and time. The combiner introduced in this system is a local reducer: it executes after the map function and before the reducer. Instead of processing the complete file at once, on-process aggregation divides the file into a number of blocks, which helps deliver the result in slots. Dividing the file into a number of data sets helps give results as early as possible by providing intermediate results to the user. The objective of the proposed technique is to improve the performance of Hadoop MapReduce, enabling efficient and easy big data processing in less time.
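The idea of block-wise processing with a combiner and early results can be illustrated with a minimal sketch. This is plain Python, not Hadoop code and not the authors' implementation: the word-count workload, the function names (`split_into_blocks`, `map_words`, `combine`, `on_process_aggregate`), and the `block_size` parameter are all assumptions chosen for illustration. The sketch divides the input into blocks, runs a map step and a local combiner on each block, merges the combined output into a running aggregate, and exposes an intermediate result after every block.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def split_into_blocks(lines: List[str], block_size: int) -> Iterator[List[str]]:
    """Divide the input into blocks of at most block_size lines (hypothetical parameter)."""
    for start in range(0, len(lines), block_size):
        yield lines[start:start + block_size]


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Combiner (local reducer): sum the counts per key within one block."""
    local: Counter = Counter()
    for key, value in pairs:
        local[key] += value
    return local


def on_process_aggregate(
    lines: List[str], block_size: int
) -> Iterator[Tuple[float, Dict[str, int]]]:
    """Yield (fraction_processed, running_counts) after each block is processed."""
    running: Counter = Counter()
    total = len(lines)
    seen = 0
    for block in split_into_blocks(lines, block_size):
        pairs = (pair for line in block for pair in map_words(line))
        # Reduce step: merge the block's combined counts into the running total.
        running.update(combine(pairs))
        seen += len(block)
        # Early, approximate result available before the whole input is done.
        yield seen / total, dict(running)


if __name__ == "__main__":
    data = ["big data", "hadoop big", "data data", "hadoop"]
    for fraction, counts in on_process_aggregate(data, block_size=2):
        print(f"{fraction:.0%} processed: {counts}")
```

Each snapshot is exact for the blocks seen so far and only approximate for the full input; the final snapshot equals the result of processing the complete file in one pass. In Hadoop itself, a combiner is typically registered on the job rather than called directly as it is here.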
