Scalable Processing of Location-Based Social Networking Queries

With GPS-enabled smartphones, social network services are enriched with location information, which allows users to share geo-tagged content with their friends. This so-called location-based social network (LBSN) data has a dual spatial and graph nature. The growing scale and importance of LBSN data necessitate a platform that (i) has both spatial and graph capabilities, (ii) supports a wide range of queries, e.g., selection, structural, and aggregate queries, and (iii) supports scalable distributed processing of large data volumes. In this paper, we propose such a platform, called Geo Social-GraphX, that segregates the LBSN data into several specific graphs capturing user-user, user-location, and location-location relationships. It enables a wide range of LBSN queries through a comprehensive set of query primitives that can be composed into more advanced queries. We implement the platform on GraphX, a map-reduce infrastructure for distributed graph computation, and further improve query performance in several ways. For social-related data, we use vertex-centric messaging operators, which address the recursive nature of graph data better than traditional two-stage map-reduce. For spatial-related data, we use effective spatial partitioning and indexing methods. Experiments on both synthetic and real LBSN datasets show that Geo Social-GraphX processes a variety of LBSN queries efficiently, scales on multicore architectures, and achieves much better performance than the state-of-the-art competing framework, Spatial Hadoop.
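To illustrate how a social primitive and a spatial primitive might be composed into a more advanced query, the following is a minimal single-machine sketch in plain Python. It is not the Geo Social-GraphX API and does not use GraphX; the function names (`k_hop_friends`, `build_grid`, `range_query`), the uniform grid, and the toy data are all assumptions made for illustration. The social part mimics vertex-centric messaging by letting newly reached vertices send to their neighbours in each superstep, and the spatial part uses a simple grid as a stand-in for the spatial partitioning and indexing mentioned above.

```python
from collections import defaultdict


def k_hop_friends(edges, source, k):
    """Hypothetical social primitive: users within k hops of `source`.

    Emulates vertex-centric messaging: in each superstep, the vertices
    reached in the previous step send a message to their neighbours.
    """
    adj = defaultdict(set)
    for u, v in edges:              # friendship treated as undirected (assumption)
        adj[u].add(v)
        adj[v].add(u)
    hops = {source: 0}              # vertex state: hop distance from source
    frontier = {source}
    for step in range(1, k + 1):
        # Send phase: active vertices message all of their neighbours.
        inbox = {n for v in frontier for n in adj[v]}
        # Receive phase: vertices without state adopt the current hop count.
        frontier = {n for n in inbox if n not in hops}
        for n in frontier:
            hops[n] = step
        if not frontier:            # no vertex changed state: stop early
            break
    hops.pop(source)
    return hops


def grid_cell(lon, lat, cell_size):
    """Map a point to the grid cell (partition) that contains it."""
    return (int(lon // cell_size), int(lat // cell_size))


def build_grid(checkins, cell_size):
    """Hypothetical spatial index: bucket (user, lon, lat) check-ins by cell."""
    grid = defaultdict(list)
    for user, lon, lat in checkins:
        grid[grid_cell(lon, lat, cell_size)].append((user, lon, lat))
    return grid


def range_query(grid, cell_size, min_lon, min_lat, max_lon, max_lat):
    """Hypothetical spatial primitive: users with a check-in inside the box.

    Only cells overlapping the box are scanned; points are then filtered exactly.
    """
    cx0, cy0 = grid_cell(min_lon, min_lat, cell_size)
    cx1, cy1 = grid_cell(max_lon, max_lat, cell_size)
    hits = set()
    for cx in range(cx0, cx1 + 1):
        for cy in range(cy0, cy1 + 1):
            for user, lon, lat in grid.get((cx, cy), []):
                if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                    hits.add(user)
    return hits


# Toy data (made up): a friendship chain a-b-c-d and three check-ins.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
checkins = [("b", 0.5, 0.5), ("c", 5.2, 5.1), ("d", 0.4, 0.6)]

friends = k_hop_friends(edges, "a", 2)
grid = build_grid(checkins, 1.0)
nearby = range_query(grid, 1.0, 0.0, 0.0, 1.0, 1.0)

# Composite query: friends within 2 hops of "a" who checked in inside the box.
result = set(friends) & nearby
```

In a distributed setting the send and receive phases would run in parallel across graph partitions and the grid cells would be spread across workers; this sketch only shows the shape of the composition.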