"Driving curiosity in search with large-scale entity networks" by Ilaria Bordino, Mounia Lalmas, Yelena Mejova, and Olivier Van Laere with Martin Vesely as coordinator

In many search scenarios, users come across unexpected yet interesting and useful results that make them curious: they experience serendipity. This curiosity encourages them to explore further. We developed an entity search system designed to support such an experience. The system explores the potential of entities extracted from two of the most popular sources of user-generated content -- Wikipedia, a user-curated online encyclopedia, and Yahoo Answers, a less constrained question-answering forum -- for promoting serendipitous search. The content of each data source is represented as a large network of entities, enriched with metadata about sentiment, writing quality, and topical category. Given an entity query, a lazy random walk with restart retrieves entities from these networks. This paper discusses our work, focusing on our experience in designing, developing, and evaluating such a system. We also discuss the challenges in developing large-scale systems that aim to drive curiosity in search.
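To make the retrieval step concrete, here is a minimal sketch of a lazy random walk with restart over a small weighted entity graph. It is an illustration under stated assumptions, not the system's actual implementation: the abstract does not give parameter values, so the names `restart` and `laziness`, their defaults, the handling of entities without outgoing edges, and the toy graph are all hypothetical.

```python
def lazy_rwr(graph, query, restart=0.15, laziness=0.5, tol=1e-10, max_iter=1000):
    """Score entities by a lazy random walk with restart from `query`.

    graph: dict mapping entity -> {neighbor: edge weight}.
    restart and laziness are assumed values, not taken from the paper.
    """
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)
    if query not in nodes:
        raise KeyError(query)

    # Start with all probability mass on the query entity.
    scores = {n: 0.0 for n in nodes}
    scores[query] = 1.0

    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for node, mass in scores.items():
            if mass == 0.0:
                continue
            # With probability `restart`, jump back to the query entity.
            nxt[query] += restart * mass
            walk = (1.0 - restart) * mass
            nbrs = graph.get(node, {})
            total = sum(nbrs.values())
            if total <= 0.0:
                # Assumption: an entity with no outgoing edges keeps its mass.
                nxt[node] += walk
                continue
            # Lazy step: stay put with probability `laziness` ...
            nxt[node] += laziness * walk
            # ... otherwise move to a neighbor in proportion to edge weight.
            for nbr, weight in nbrs.items():
                nxt[nbr] += (1.0 - laziness) * walk * weight / total
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


def top_entities(graph, query, k=5, **kwargs):
    """Return the k highest-scoring entities other than the query itself."""
    scores = lazy_rwr(graph, query, **kwargs)
    ranked = sorted((n for n in scores if n != query),
                    key=lambda n: (-scores[n], n))
    return ranked[:k]


# Hypothetical toy network; a real entity network would be far larger.
toy_graph = {
    "penguin": {"antarctica": 2.0, "sweater": 1.0},
    "antarctica": {"penguin": 1.0},
    "sweater": {"knitting": 1.0, "penguin": 1.0},
    "knitting": {"sweater": 1.0},
}
related = top_entities(toy_graph, "penguin", k=2)
```

The restart term keeps the walk anchored near the query entity, while the lazy self-loop slows diffusion so that scores are less dominated by high-degree neighbors. At the scale of Wikipedia or Yahoo Answers, a dense pass over every node as written here would likely be replaced by a sparse or distributed computation.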
