Using the Open Meta Kaggle Dataset to Evaluate Tripartite Recommendations in Data Markets

This work addresses the problem of providing and evaluating recommendations in data markets. Most research on recommender systems focuses on the bipartite relationship between users and items (e.g., movies); we extend this view to the tripartite relationship between users, datasets, and services that is present in data markets. Between these entities, we identify four use cases for recommendations: (i) recommendation of datasets for users, (ii) recommendation of services for users, (iii) recommendation of services for datasets, and (iv) recommendation of datasets for services. Using the open Meta Kaggle dataset, we evaluate the recommendation accuracy of a popularity-based algorithm and a collaborative filtering-based algorithm for these four use cases and find that the accuracy strongly depends on the use case. The presented work contributes to the tripartite recommendation problem in general and to the under-researched topic of evaluating recommender systems for data markets in particular.
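To make the four use cases concrete, the following is a minimal sketch, not the authors' implementation. It assumes interactions are available as (user, dataset, service) triples, which is a simplification of the Meta Kaggle data. The toy triples, the function names (`bipartite_view`, `most_popular`, `collaborative_filtering`), and the choice of cosine similarity over binary interaction sets are all illustrative assumptions.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy interactions: (user, dataset, service) triples.
TRIPLES = [
    ("u1", "d1", "s1"),
    ("u1", "d2", "s1"),
    ("u2", "d1", "s2"),
    ("u2", "d3", "s1"),
    ("u3", "d1", "s1"),
    ("u3", "d2", "s2"),
]

# Position of each entity type within a triple.
POS = {"user": 0, "dataset": 1, "service": 2}


def bipartite_view(triples, target, item):
    """Project the triples onto a mapping: target entity -> set of items."""
    view = defaultdict(set)
    for triple in triples:
        view[triple[POS[target]]].add(triple[POS[item]])
    return dict(view)


def most_popular(view, target, k=2):
    """Rank items by how many targets they are linked to; skip known items."""
    counts = Counter(i for items in view.values() for i in items)
    known = view.get(target, set())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [i for i, _ in ranked if i not in known][:k]


def cosine(a, b):
    """Cosine similarity between two binary interaction sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def collaborative_filtering(view, target, k=2):
    """Neighbourhood-based CF: score unseen items by summed neighbour similarity."""
    known = view.get(target, set())
    scores = defaultdict(float)
    for other, items in view.items():
        if other == target:
            continue
        sim = cosine(known, items)
        for i in items - known:
            scores[i] += sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [i for i, s in ranked if s > 0][:k]


# The four use cases as (target, item) projections of the same triples.
USE_CASES = {
    "datasets for users": ("user", "dataset"),
    "services for users": ("user", "service"),
    "services for datasets": ("dataset", "service"),
    "datasets for services": ("service", "dataset"),
}

if __name__ == "__main__":
    for name, (target, item) in USE_CASES.items():
        view = bipartite_view(TRIPLES, target, item)
        for t in sorted(view):
            print(name, t, most_popular(view, t), collaborative_filtering(view, t))
```

Each use case reuses the same two recommenders and changes only which entity plays the role of target and which plays the role of item, which is one way to read the tripartite setting as four bipartite projections.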
