Analysis of Range-Based Key Properties for Sharded Cluster of MongoDB

MongoDB is one of the most popular NoSQL databases today. It is an open-source, document-oriented database with a flexible schema. To increase performance, it can scale both vertically and horizontally. For horizontal scaling, MongoDB uses an auto-sharding technique to divide data and distribute it across multiple machines. With this technique, however, a database administrator must choose a shard key that MongoDB uses to split a collection. Selecting the right key can improve the performance and capability of a database. Conversely, a wrong key degrades performance and, in serious cases, can bring the system to a halt. It is therefore important to choose the key carefully. MongoDB offers general guidance on the properties of an ideal shard key. For instance, a good shard key should have a high degree of randomness for write scaling and high locality for range queries. To understand the impact of these properties on a shard key, this paper analyzes and evaluates them. We discuss how variations in the choice of shard key affect database performance, which gives MongoDB administrators a basis for understanding and selecting a good shard key for their systems.
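The tension between randomness and locality can be illustrated with a small simulation. The sketch below is not taken from the paper's experiments and does not call MongoDB. It assumes a hypothetical cluster of four shards, each owning one contiguous range of an integer key space [0, 1000), and the names `BOUNDARIES`, `shard_for`, and `shards_holding` are invented for illustration.

```python
import bisect
import random

# Hypothetical cluster (an assumption, not MongoDB's actual chunk layout):
# shard 0 owns [0, 250), shard 1 owns [250, 500),
# shard 2 owns [500, 750), shard 3 owns [750, 1000).
BOUNDARIES = [250, 500, 750]


def shard_for(key):
    """Return the index of the shard whose key range contains `key`."""
    return bisect.bisect_right(BOUNDARIES, key)


def shards_holding(keys):
    """Return the set of shards that store at least one of `keys`."""
    return {shard_for(k) for k in keys}


# 100 related documents written in one burst with a monotonically
# increasing key, such as a counter or a timestamp.
monotonic_keys = list(range(600, 700))

# The same 100 documents keyed by uniformly random values instead.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_keys = [rng.randrange(1000) for _ in range(100)]

monotonic_shards = shards_holding(monotonic_keys)
random_shards = shards_holding(random_keys)

print("monotonic keys land on shards:", sorted(monotonic_shards))
print("random keys land on shards:   ", sorted(random_shards))
```

Under these assumptions the monotonic keys all fall on a single shard. That shard absorbs the whole write burst, but a later range query over those keys contacts only that one shard. The random keys spread the writes over several shards, at the cost of a query for the same documents having to contact each of them.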
