The twitaholic next door.: scalable friend recommender system using a concept-sensitive hash function

In this paper we present a Friend Recommender System for micro-blogging. Traditional batch processing of massive amounts of data makes it difficult to provide a near-real time friend recommender system or even a system that can properly scale to millions of users. In order to overcome these issues, we have designed a solution that represents user-generated micro posts as a set of pseudo-cliques. These graphs are assigned a hash value using an original Concept-Sensitive Hash function, a new sub-kind of Locally-Sensitive Hash functions. Finally, since the user profiles are represented as a binary footprint, the pairwise comparison of footprints using the Hamming distance provides scalability to the recommender system. The paper goes with an online application relying on a large Twitter dataset, so that the reader can freely experiment the system.