Locality Sensitive Hashing for Similarity Search Using MapReduce on Large Scale Data

This paper describes a popular approach to the problem of similarity search: methods based on Locality Sensitive Hashing (LSH). To cope with large-scale data, these techniques are run on a distributed, parallel computing framework using the MapReduce paradigm, through its open-source implementation, Apache Hadoop.
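To make the combination of LSH and MapReduce concrete, the following is a minimal single-machine sketch, not the paper's implementation. It assumes a MinHash family with banding (the abstract does not say which LSH family is used), and it simulates the map, shuffle, and reduce steps in plain Python instead of calling Hadoop. All names and parameters (`NUM_HASHES`, `BANDS`, `map_phase`, `reduce_phase`, `run_job`) are hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

# Assumed parameters for illustration only.
NUM_HASHES = 20               # length of each MinHash signature
BANDS = 10                    # number of bands the signature is split into
ROWS = NUM_HASHES // BANDS    # rows (hash values) per band


def _hash(seed, token):
    """Seeded 64-bit hash of a token, derived from MD5 for determinism."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(tokens):
    """MinHash signature of a non-empty collection of tokens."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES)]


def map_phase(doc_id, tokens):
    """Mapper: emit one ((band index, band values), doc_id) pair per band."""
    sig = minhash_signature(tokens)
    for b in range(BANDS):
        band = tuple(sig[b * ROWS:(b + 1) * ROWS])
        yield (b, band), doc_id


def reduce_phase(key, doc_ids):
    """Reducer: documents sharing a bucket become candidate pairs."""
    for pair in combinations(sorted(set(doc_ids)), 2):
        yield pair


def run_job(docs):
    """Simulate map, shuffle (group by key), and reduce in memory."""
    buckets = defaultdict(list)
    for doc_id, tokens in docs.items():
        for key, value in map_phase(doc_id, tokens):
            buckets[key].append(value)
    candidates = set()
    for key, ids in buckets.items():
        candidates.update(reduce_phase(key, ids))
    return candidates


if __name__ == "__main__":
    docs = {
        "a": {"locality", "sensitive", "hashing", "search"},
        "b": {"locality", "sensitive", "hashing", "search"},
        "c": {"apache", "hadoop", "cluster", "storage"},
    }
    print(run_job(docs))
```

In this sketch, documents with identical token sets always land in the same buckets and so are always reported as a candidate pair; near-duplicates are reported with a probability that depends on their Jaccard similarity and on the assumed band and row counts. Candidate pairs would still need an exact similarity check in a later step.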