City-Stories: A Multimedia Hybrid Content and Entity Retrieval System for Historical Data

Information systems used in tourism rely mostly on up-to-date content on attractive places. In addition, these systems increasingly make use of archived photographs, documents, films, or even ancient paintings and other artwork by integrating curated content from museums and memory institutions, possibly enriched with user-provided content. Hence, the distinction between cultural heritage applications and tourism applications is increasingly blurred. Users are interested not only in the current appearance of landscapes, monuments, or buildings, but also in the evolution of these places over time. This requires large multimedia collections that integrate content from several cultural heritage institutions. As a consequence, interactive retrieval systems for historical multimedia are needed that support homogeneous content-based and semantic querying despite the heterogeneity of these collections. In this paper we present City-Stories, a multimedia hybrid content and entity retrieval system. City-Stories is based on a state-of-the-art open source multimedia retrieval system. Multimedia features in City-Stories represent multiple semantic levels: low-level (e.g., color, edge, motion), mid-level (e.g., date, location, objects), and high-level features (e.g., semantic entities, scene category). For the latter, City-Stories applies entity recognition and entity linking to identify semantic concepts and to link objects across media types. Consequently, City-Stories supports various types of cross-modal queries. Moreover, City-Stories uses a map-based visualization layer that facilitates spatial queries and browsing. Finally, City-Stories follows a crowdsourcing approach for content annotation and for enriching curated content with multimedia objects and documents provided by users. The paper shows how the City-Stories system seamlessly combines content-based search with entity-based navigation and leverages the wisdom of the crowd for manual annotations.
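To make the idea of hybrid retrieval more concrete, the following minimal sketch illustrates how a content-based similarity score, an entity-based score, and a spatial filter could be combined into a single ranking. It is not the City-Stories implementation: the `MediaObject` type, the functions, the Jaccard-based entity score, the linear weighting, and the `content_weight` parameter are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt


@dataclass
class MediaObject:
    """Hypothetical record for one multimedia object in the collection."""
    object_id: str
    feature: list[float]                             # low-level feature vector, e.g. a color histogram
    entities: set[str] = field(default_factory=set)  # identifiers of linked semantic entities
    lat: float = 0.0                                 # mid-level metadata: location
    lon: float = 0.0


def content_similarity(a: list[float], b: list[float]) -> float:
    """Map the Euclidean distance between two feature vectors to (0, 1]."""
    dist = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def entity_overlap(query_entities: set[str], object_entities: set[str]) -> float:
    """Jaccard overlap between query entities and an object's linked entities."""
    if not query_entities or not object_entities:
        return 0.0
    return len(query_entities & object_entities) / len(query_entities | object_entities)


def in_bounding_box(obj: MediaObject, south: float, west: float,
                    north: float, east: float) -> bool:
    """Spatial filter, as a map-based interface might issue it."""
    return south <= obj.lat <= north and west <= obj.lon <= east


def hybrid_search(collection: list[MediaObject], query_feature: list[float],
                  query_entities: set[str],
                  bbox: tuple[float, float, float, float],
                  content_weight: float = 0.5) -> list[tuple[str, float]]:
    """Rank objects inside bbox by a weighted sum of both scores (assumed fusion rule)."""
    scored = []
    for obj in collection:
        if not in_bounding_box(obj, *bbox):
            continue  # outside the map region selected by the user
        score = (content_weight * content_similarity(query_feature, obj.feature)
                 + (1.0 - content_weight) * entity_overlap(query_entities, obj.entities))
        scored.append((obj.object_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch, an object that matches the query both visually and through a shared linked entity ranks above one that matches only visually, which mirrors the combination of content-based search and entity-based navigation described above. A real system would likely use an index rather than a linear scan and a more careful score fusion.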
