Mining feedback in ranking and recommendation systems

The amount of online information has grown exponentially over the past few decades, and users have become increasingly dependent on ranking and recommendation systems to meet their information-seeking needs. Advances in information technology have enabled users to provide feedback on the utility of these systems and, in return, enabled the systems to use that feedback to improve their service. It is increasingly important to tailor relevant information to different users and applications based on such feedback. In this dissertation, we study how feedback can be used to improve the service quality of ranking and recommendation systems, in the application context of Web search engines and large-scale digital libraries. We first introduce a flow-based collaborative ranking model of users' collective feedback in an online information-seeking process. The model constructs a flow-based network to describe the relationships among collaborating users, queries, and documents. This generic model allows us to quantitatively investigate the properties of a collaborative ranking process. We also present a collaborative ranking algorithm derived from this model. We then study implicit user feedback in query reformulations in the context of general-purpose Web search ranking. Specifically, we apply knowledge of user feedback to address the problems introduced by under-specified queries. We propose an algorithm that leverages the query context to refine the relevance ranking of search results, and we describe empirical evaluations that demonstrate the benefits of our proposal. Finally, we study the utility of feedback in two vertical ranking and recommendation systems. The first is a geographic information retrieval system. We analyze users' historical clicks as implicit feedback to the system, and study two click-based models that infer geographical preference by mining user click-stream data. We are able to identify search queries and documents with spatial specificity, and to generate effective relevance features for search ranking. The second vertical is a venue recommendation system for digital libraries. We study the feedback loop between publication quality and venue organization, and propose a set of heuristics to automatically discover prestigious (as well as low-quality) publication venues by exploring the characteristics of the venue organizers.