ViTOR: Learning to Rank Webpages Based on Visual Features

The visual appearance of a webpage carries valuable information about the page's quality and can be used to improve the performance of learning to rank (LTR). We introduce the Visual learning TO Rank (ViTOR) model, which integrates two state-of-the-art visual feature extraction methods: (i) transfer learning from a pre-trained image classification model, and (ii) synthetic saliency heat maps generated from webpage snapshots. Because no public dataset currently exists for the task of LTR with visual features, we also introduce and release the ViTOR dataset, which contains visually rich and diverse webpages. The dataset consists of visual snapshots, non-visual features, and relevance judgments for ClueWeb12 webpages and TREC Web Track queries. Experiments with the ViTOR model on the ViTOR dataset show that it significantly improves the performance of LTR with visual features.
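The general idea of combining visual and non-visual features for ranking can be illustrated with a minimal sketch. It is not the ViTOR architecture itself: the linear scorer, the pairwise hinge loss, and all names (`fuse`, `score`, `pairwise_hinge`, `rank`) are assumptions made for illustration, and the visual vectors are hand-written placeholders standing in for what a pre-trained image model or saliency pipeline might produce from a snapshot.

```python
from typing import Dict, List, Sequence, Tuple


def fuse(visual: Sequence[float], non_visual: Sequence[float]) -> List[float]:
    """Concatenate visual and non-visual features into one vector (late fusion)."""
    return list(visual) + list(non_visual)


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Hypothetical linear scorer: dot product of weights and fused features."""
    return sum(w * x for w, x in zip(weights, features))


def pairwise_hinge(s_pos: float, s_neg: float, margin: float = 1.0) -> float:
    """Pairwise hinge loss: zero once the relevant page outscores the
    non-relevant page by at least `margin` (an assumed training objective)."""
    return max(0.0, margin - (s_pos - s_neg))


def rank(weights: Sequence[float],
         docs: Dict[str, Tuple[Sequence[float], Sequence[float]]]) -> List[str]:
    """Return document ids sorted by descending score."""
    scored = {doc_id: score(weights, fuse(v, n)) for doc_id, (v, n) in docs.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Placeholder data: (visual features, non-visual features) per page.
docs = {
    "page_a": ([0.2, 0.1], [1.0]),
    "page_b": ([0.9, 0.8], [0.5]),
}
weights = [1.0, 1.0, 1.0]  # illustrative, not learned

ranking = rank(weights, docs)
loss = pairwise_hinge(score(weights, fuse(*docs["page_b"])),
                      score(weights, fuse(*docs["page_a"])))
```

With these placeholder values, `page_b` ranks above `page_a` because its visual features contribute more to the score than `page_a` gains from its non-visual feature. In a trained model the weights would be learned from relevance judgments instead of being fixed by hand.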
