Beyond entities: promoting explorative search with bundles

Search engines increasingly go beyond the pure relevance of search results to entertain users with information items that are interesting and even surprising, albeit sometimes not fully related to their search intent. In this paper, we study this serendipitous search space in the context of entity search, which has recently emerged as a powerful paradigm for building semantically rich answers. Specifically, we propose to enhance an explorative search system, which represents a large sample of Yahoo Answers as an entity network, with a result structure that goes beyond ranked lists. To do so, we use composite entity retrieval, which requires bundling the results. We propose and compare six bundling methods, which exploit topical categories, entity specializations, and sentiment, and go beyond simple entity clustering. Two large-scale crowd-sourced studies show that users find a bundled organization, especially one based on the topical categories of the query entity, to be better at revealing the most useful results, as well as at organizing the results, helping to discover novel and interesting information, and promoting exploration. Finally, a third study of 30 simulated search tasks reveals the bundled search experience to be less frustrating and more rewarding, with more users willing to recommend it to others.
