Focus Paragraph Detection for Online Zero-Effort Queries: Lessons learned from Eye-Tracking Data

To realize zero-effort retrieval in a web context, it is crucial to identify the part of the web page the user is focusing on. In this paper, we investigate the identification of focus paragraphs in web pages. Starting from a naive baseline for paragraph and focus paragraph detection, we conducted an eye-tracking study to evaluate the most promising features. We found that individual features (mouse position, paragraph position, mouse activity) are, taken alone, only weakly predictive of gaze, which confirms findings from other studies. The results indicate that an algorithm for focus paragraph detection needs to incorporate a weighted combination of those features as well as additional features, e.g., semantic context derived from the user's web history.
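As an illustration of what a weighted combination of the three features could look like, the following minimal Python sketch scores each paragraph and picks the highest-scoring one. It is not the algorithm evaluated in the study: the class `ParagraphFeatures`, the functions `focus_score` and `detect_focus_paragraph`, the normalisation of each feature to the range 0 to 1, and the weight values are all hypothetical assumptions made for this example. Additional features such as semantic context from the web history are not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ParagraphFeatures:
    """Hypothetical per-paragraph features, each assumed normalised to [0, 1]."""
    paragraph_id: str
    mouse_proximity: float  # 1.0 = pointer inside the paragraph, 0.0 = far away
    position_score: float   # 1.0 = paragraph centred in the viewport
    mouse_activity: float   # recent pointer movement near the paragraph


# Illustrative weights only; they are not derived from the eye-tracking data.
WEIGHTS: Dict[str, float] = {
    "mouse_proximity": 0.5,
    "position_score": 0.3,
    "mouse_activity": 0.2,
}


def focus_score(f: ParagraphFeatures, weights: Dict[str, float] = WEIGHTS) -> float:
    """Linear weighted combination of the three features."""
    return (
        weights["mouse_proximity"] * f.mouse_proximity
        + weights["position_score"] * f.position_score
        + weights["mouse_activity"] * f.mouse_activity
    )


def detect_focus_paragraph(
    paragraphs: List[ParagraphFeatures],
    weights: Dict[str, float] = WEIGHTS,
) -> Optional[ParagraphFeatures]:
    """Return the paragraph with the highest combined score, or None if empty."""
    if not paragraphs:
        return None
    return max(paragraphs, key=lambda f: focus_score(f, weights))


if __name__ == "__main__":
    candidates = [
        # Pointer rests on p1, but p1 sits near the viewport edge.
        ParagraphFeatures("p1", mouse_proximity=0.9, position_score=0.2, mouse_activity=0.1),
        # p2 is centred and has recent pointer activity nearby.
        ParagraphFeatures("p2", mouse_proximity=0.4, position_score=0.9, mouse_activity=0.8),
    ]
    best = detect_focus_paragraph(candidates)
    print(best.paragraph_id if best else None)
```

In this example, mouse position alone would select `p1`, whereas the combined score selects `p2`. In practice the weights would have to be fitted against gaze data rather than chosen by hand.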