Automatically embedding newsworthy links to articles: From implementation to evaluation

News portals are a popular destination for web users, and news providers are therefore interested in attracting more visitors and promoting greater engagement with their content. One aspect of engagement is keeping users on site longer by offering them an enhanced click-through experience. News portals have invested in embedding links within news stories, but so far these links have been curated by news editors. Given the manual effort involved, such links are used only on a small scale. In this article, we evaluate a system-based approach that detects newsworthy events in a news article and locates other articles related to these events. Our system does not rely on resources such as Wikipedia to identify events, and it was designed to be domain independent. We performed a rigorous evaluation, using Amazon's Mechanical Turk, to assess the system-embedded links against the manually curated ones. Our findings reveal that our system's performance is comparable with that of professional editors, and that users find the automatically generated highlights interesting and the associated articles worthy of reading. Our evaluation also provides quantitative and qualitative insights into the curation of links, from the perspective of users and professional editors.
