Scaling Up Recommendation Services in Many Dimensions

Gravity R&D has been providing recommendation engines as SaaS solutions since 2009. We have a strong research focus, and recommendation quality has always been our primary differentiating factor. Widely used and open-source recommendation algorithms are of little use to our technology team, because our in-house, proprietary algorithms outperform them. We have faced many challenges while scaling up our services: the sheer quantity of data handled daily has grown exponentially. This presentation covers how overcoming these challenges permanently shaped the algorithms and system architecture we use to generate recommendations.

Serving personalized recommendations requires real-time computation and data access for every single request. To generate a response in real time, the user's current inputs have to be compared against their history, and this user information is then combined with specific details about the available items. Providing accurate recommendations becomes harder as the number of transactions and items grows, and harder still because the analysis combines multiple heterogeneous algorithms, each requiring different inputs.

Initially, the architecture was designed around matrix factorization (MF) based models, serving huge numbers of requests over a limited number of items. Now, Gravity uses MF, neighborhood-based, and metadata-based models to generate recommendations for millions of items in its databases. This required a shift from a monolithic architecture with in-process caching to a more service-oriented architecture with multi-layer caching. With the growth in the number of components and clients, managing the infrastructure has become considerably more difficult. Even with these challenges, we do not believe a fully distributed system is worthwhile: it adds unneeded complexity, resource consumption, and overhead. We prefer to first optimize our current algorithms and architecture, and to move to a distributed system only when no other options are left.
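
To make the model combination above concrete, here is a minimal sketch (not Gravity's actual implementation): an MF model scores an item as the dot product of latent user and item vectors, and the outputs of heterogeneous models are then blended. The function names, bias terms, placeholder scores, and the simple linear blend are illustrative assumptions only.

```python
import numpy as np

def mf_score(user_vec, item_vec, user_bias=0.0, item_bias=0.0):
    """Score one item for one user under a matrix-factorization model:
    dot product of latent vectors plus optional bias terms."""
    return user_bias + item_bias + float(np.dot(user_vec, item_vec))

def blended_score(model_scores, weights):
    """Combine outputs of heterogeneous models (MF, neighborhood-based,
    metadata-based) with a weighted sum; the linear blend is an
    illustrative assumption, not Gravity's actual combination logic."""
    return sum(weights.get(name, 0.0) * s for name, s in model_scores.items())

# Latent factors would normally come from the trained model and the user's
# history; random vectors stand in for them here.
user_vec = np.random.rand(40)
item_vec = np.random.rand(40)
scores = {
    "mf": mf_score(user_vec, item_vec),
    "neighborhood": 0.7,  # placeholder similarity-based score
    "metadata": 0.4,      # placeholder metadata-based score
}
final = blended_score(scores, {"mf": 0.6, "neighborhood": 0.3, "metadata": 0.1})
```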
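
The multi-layer caching mentioned above can be pictured roughly as follows. This is a hedged sketch under assumed interfaces: a small in-process cache in front of a shared cache service, with recomputation from the model or database layer as the final fallback. It is not a description of Gravity's production code.

```python
from typing import Callable, Optional

class MultiLayerCache:
    """Look up a value in an in-process cache first, then a shared cache
    service, and only then recompute it from the underlying models/DB."""

    def __init__(self, local: dict,
                 remote_get: Callable[[str], Optional[bytes]],
                 remote_put: Callable[[str, bytes], None]):
        self.local = local            # layer 1: in-process, per service instance
        self.remote_get = remote_get  # layer 2: shared cache service client
        self.remote_put = remote_put

    def get(self, key: str, compute: Callable[[], bytes]) -> bytes:
        if key in self.local:              # fastest path: in-process hit
            return self.local[key]
        value = self.remote_get(key)       # shared cache shared across instances
        if value is None:
            value = compute()              # fall back to recomputation
            self.remote_put(key, value)
        self.local[key] = value
        return value

# Usage sketch: a plain dict stands in for a shared cache service here.
shared = {}
cache = MultiLayerCache(local={}, remote_get=shared.get,
                        remote_put=shared.__setitem__)
recs = cache.get("user:42:recs", compute=lambda: b"item1,item2,item3")
```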