K-mer clustering algorithm using a MapReduce approach

With recent advances in high-throughput sequencing platforms, it is possible to sequence RNA obtained from biological samples more cost-effectively and comprehensively. Because the technology has become ubiquitous, massive volumes of RNA sequence data are now being generated, and the need for more efficient analysis software has become urgent.