Improving Free Text Recommendation Time by Means of Clustering Algorithms

In this paper, we study the effects of applying clustering algorithms to free text recommendation systems. Recommendation systems usually do not scale well as the size of the recommendation space grows. One of the main techniques for scaling them is clustering; however, clustering usually has a negative impact on accuracy when applied without taking the recommended items into consideration. We construct a simple recommendation system for documents and propose partitioning its search space using k-means. We vary the number of clusters and analyze how this affects performance in terms of recommendation time and accuracy. We apply a word-embedding-based technique to represent each document's bag-of-words, which allows us to compare how clustering algorithms perform at partitioning these documents. One of the main findings of this work is that clustering can reduce recommendation time by a factor of almost 4 without losing much of the initial accuracy. Another interesting finding is that increasing the number of clusters does not translate directly into a linear performance gain.
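The partition-then-search idea summarized above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: each document is already available as a fixed-length vector (for example, an average of its word embeddings), similarity is measured by squared Euclidean distance, a query is compared only against the documents in the cluster whose centroid is nearest, and centroids are initialized deterministically by a farthest-first rule. All function names (`kmeans`, `recommend`, and the helpers) are hypothetical and chosen for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to the given vector."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(vector, centroids[c]))


def farthest_first(vectors, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first vector, then repeatedly add the vector that is
    farthest from all centroids chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(vectors, k, max_iterations=50):
    """Plain Lloyd-style k-means; returns (centroids, assignment)."""
    centroids = farthest_first(vectors, k)
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [nearest_centroid(v, centroids) for v in vectors]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids are too
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return centroids, assignment


def recommend(query, vectors, centroids, assignment, top_n=3):
    """Rank only the documents in the query's nearest cluster."""
    cluster = nearest_centroid(query, centroids)
    candidates = [i for i, a in enumerate(assignment) if a == cluster]
    candidates.sort(key=lambda i: squared_distance(query, vectors[i]))
    return candidates[:top_n]


# Toy document vectors standing in for embedding-based representations.
documents = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
             [5.0, 5.1], [5.1, 5.0], [5.2, 5.1]]
centroids, assignment = kmeans(documents, k=2)
print(recommend([5.0, 5.12], documents, centroids, assignment, top_n=2))
```

In this sketch a query is compared against k centroids plus the members of one cluster instead of every document, which is where the time saving would come from. The accuracy cost arises when a relevant document falls in a different cluster from the one selected for the query.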