Si-Fi: interactive similar item finder

Many recommender systems retrieve items similar to a given one. Similar items can be numerous and can vary widely in their properties, since, from the users’ point of view, similarity can be any relationship that the given item has with other items. Despite this, the top-k retrieval scheme that most information retrieval (IR) systems employ returns the k items with the highest system-computed similarity scores, regardless of the wide range of users’ current interests. Suppose a user wants to find artists similar to her favorite British Pop artist. Most IR systems adopting the traditional top-k retrieval scheme would use the artist as a query and return a list of similar items. The result list may include almost identical British Pop artists at the top, with hundreds of related items below them. This is not a problem if she likes the top-ranked items, but users are not always satisfied with them. She may already know almost all of the top-ranked artists, or she may want to try another genre of music related to her favorite one by filtering out British Pop artists. The plain approach cannot satisfy these needs. As some previous work points out, a plain top-k list should be refined to better satisfy users. The biggest difficulty is that a single measure of similarity does not reveal users’ taste in most cases, and it is hard even to know what that taste is, which is one of the goals of this kind of retrieval. Content-based methods can be used to find the item properties that users prefer, but suggestions from content-based systems often annoy users with their sheer size. Some work has tried to overcome this by adding interactive exploration methods to IR systems. Google Image Swirl and [1] use iterative clustering of the returned set of items to give it structure and to offer diversified choices to users. However, their approach is not well suited to similar item finding, because it relies on the user having a clear interest.
We focus on situations where users’ interest is hard to determine. To overcome the stated problem, we present Si-Fi (Similar

[1] Bin Liu, et al. Using Trees to Depict a Forest. Proc. VLDB Endow., 2009.