Big data query optimization by using Locality Sensitive Bloom Filter

Bloom filters play an important role in search techniques, both for fast data access and in networking. A Bloom filter answers membership queries quickly and probabilistically, which also reduces the cost of analyzing data. Many applications already use Bloom filters to access and process data, so applying them to big data should make query processing more efficient. This paper proposes an approach for implementing the Locality Sensitive Bloom Filter (LSBF) technique in big data. To overcome the drawbacks of simple hashing, the LSBF stores data in the Bloom filter using Locality Sensitive Hashing, which makes it possible to retrieve the closest approximate match to a query.
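The idea of storing data in a Bloom filter through locality-sensitive hashes can be illustrated with a minimal sketch. It is not the implementation proposed in the paper. It assumes numeric feature vectors, Euclidean distance, and hash functions of the random-projection (p-stable) form floor((a·v + b) / w). The class name, method names, and all parameter values (`num_bits`, `num_hashes`, `bucket_width`) are hypothetical choices made for illustration.

```python
import random


class LocalitySensitiveBloomFilter:
    """Toy LSBF sketch: a bit array indexed by random-projection LSH functions."""

    def __init__(self, dim, num_bits=1024, num_hashes=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.num_bits = num_bits
        self.bucket_width = bucket_width
        self.bits = [False] * num_bits
        # Each hash function is a pair (a, b): a Gaussian projection vector
        # and a random offset in [0, bucket_width]. (Illustrative assumption.)
        self.hashes = [
            (
                [rng.gauss(0.0, 1.0) for _ in range(dim)],
                rng.uniform(0.0, bucket_width),
            )
            for _ in range(num_hashes)
        ]

    def _positions(self, vector):
        """Yield one bit position per hash function for the given vector."""
        for i, (a, b) in enumerate(self.hashes):
            projection = sum(x * y for x, y in zip(a, vector))
            # Nearby vectors tend to land in the same bucket.
            bucket = int((projection + b) // self.bucket_width)
            # Fold the (hash index, bucket) pair into the bit array.
            yield hash((i, bucket)) % self.num_bits

    def add(self, vector):
        """Set the bits selected by every hash function."""
        for pos in self._positions(vector):
            self.bits[pos] = True

    def might_contain_near(self, vector):
        """Return True if every selected bit is set (false positives possible)."""
        return all(self.bits[pos] for pos in self._positions(vector))


lsbf = LocalitySensitiveBloomFilter(dim=3)
lsbf.add([1.0, 2.0, 3.0])
print(lsbf.might_contain_near([1.0, 2.0, 3.0]))  # the stored vector itself
print(lsbf.might_contain_near([1.1, 2.0, 2.9]))  # a nearby vector (probabilistic)
```

A query for a stored vector always succeeds, because it sets and then reads the same bits. A query for a nearby vector succeeds only when every projection falls into the same bucket as a stored item, so the answer is probabilistic and depends on `bucket_width` and `num_hashes`. Unlike a standard Bloom filter, this kind of structure can also miss near neighbors, not only report false positives.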
