A time efficient and accurate retrieval of range aggregate queries using fuzzy clustering means (FCM) approach

Massive growth in big data makes it difficult to analyse the available data and retrieve useful information from it. Statistical analysis: Existing approaches cannot guarantee efficient retrieval of data from the database. In the existing work, stratified sampling is used to partition the tables in terms of static variables. However, the k-means clustering algorithm cannot guarantee efficient retrieval, because choosing centroids in a large volume of data is difficult, and limited knowledge of the static variable may lead to less efficient partitioning of the tables. Findings: The proposed methodology overcomes this problem by replacing k-means clustering with FCM clustering, which can cluster large volumes of data that are similar in nature. The stratification problem is overcome by introducing a post-stratification approach, which leads to efficient selection of the static variable. Improvements: This methodology leads to an efficient retrieval process for user queries, taking less time and giving higher accuracy.
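As an illustration of the clustering step named in the abstract, the following is a minimal sketch of the standard fuzzy c-means update rules in plain Python. It is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the stopping tolerance, and the toy one-dimensional data are all assumptions made for this example. Unlike k-means, each point receives a degree of membership in every cluster rather than a single hard label.

```python
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of numeric tuples (hypothetical helper).

    Returns (centers, U), where U[i][j] is the membership of point i in cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [U[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum for d in range(dim)]
            )

        # Membership update: proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dists = [
                max(sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            inv = [dist ** (-2.0 / (m - 1.0)) for dist in dists]
            inv_sum = sum(inv)
            new_row = [v / inv_sum for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, U


# Toy data (assumed for illustration): two well-separated groups of values.
points = [(1.0,), (1.2,), (0.8,), (9.0,), (9.3,), (8.7,)]
centers, U = fuzzy_c_means(points, c=2)
```

Each row of `U` sums to 1, so a record lying between two groups contributes partially to both. How the resulting soft partitions would be combined with post-stratification to answer a range-aggregate query is not specified in the abstract and is not shown here.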
