Reranking Strategies Based on Fine-Grained Business User Events Benchmarked on a Large E-commerce Data Set

Traditional search engines based on text content often fail to display the products that customers actually want, so web companies commonly resort to reranking techniques to improve the relevance of products for a given user query. To this end, one may take advantage of fine-grained past user events, such as clicks, add-to-basket actions, or purchases, which it is now feasible to collect and process. We use a real-world data set of such events, collected over a five-month period at a leading e-commerce company, to benchmark reranking algorithms. A simple strategy consists of reordering products according to the clicks they gather. We also propose a more sophisticated method, based on an autoregressive model, that predicts the number of purchases from past events. Since we work with retail data, we argue that the most relevant and objective performance metric is the percentage of revenue generated by the top reranked products, rather than subjective criteria based on manually assigned relevance scores. Evaluating the algorithms in this way against our database of purchase events, we find that the top four products displayed by a state-of-the-art search engine capture on average about 25% of the revenue; reordering products according to the clicks they gather increases this percentage to about 48%; and the autoregressive method reaches approximately 55%. An analysis of the coefficients of the autoregressive model shows that past user events lose most of their predictive power after 2–3 days.
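The click-based strategy and the revenue metric described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the dictionary-based data layout, and the toy numbers are assumptions made for illustration, and the revenue share is computed over the products returned for a single query.

```python
from typing import Dict, List, Sequence


def rerank_by_clicks(products: Sequence[str], clicks: Dict[str, int]) -> List[str]:
    """Reorder products by decreasing past click count (hypothetical helper).

    sorted() is stable, so products with equal click counts keep the
    order in which the search engine originally returned them.
    """
    return sorted(products, key=lambda p: -clicks.get(p, 0))


def revenue_share_top_k(ranking: Sequence[str], revenue: Dict[str, float], k: int = 4) -> float:
    """Fraction of the query's revenue captured by the first k ranked products."""
    total = sum(revenue.get(p, 0.0) for p in ranking)
    if total == 0:
        return 0.0  # no purchases recorded for this query
    top = sum(revenue.get(p, 0.0) for p in ranking[:k])
    return top / total


# Toy data for one query (invented numbers, not taken from the data set).
engine_order = ["a", "b", "c", "d", "e", "f"]                       # search engine ranking
past_clicks = {"a": 1, "b": 0, "c": 5, "d": 2, "e": 9, "f": 3}      # clicks in a past window
later_revenue = {"a": 10.0, "b": 0.0, "c": 30.0, "d": 10.0, "e": 40.0, "f": 10.0}

click_order = rerank_by_clicks(engine_order, past_clicks)
baseline_share = revenue_share_top_k(engine_order, later_revenue, k=4)
reranked_share = revenue_share_top_k(click_order, later_revenue, k=4)

print(click_order)      # ['e', 'c', 'f', 'd', 'a', 'b']
print(baseline_share)   # 0.5
print(reranked_share)   # 0.9
```

In this sketch the clicks are assumed to come from a period preceding the purchases used for scoring, so the ranking never sees the revenue it is evaluated on. Averaging the per-query share over many queries would give a figure comparable in spirit to the percentages reported above.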