Designing Better Playlists with Monte Carlo Tree Search

In recent years, there has been growing interest in automated playlist generation: music recommender systems that model preferences over song sequences rather than over individual songs in isolation. This paper addresses the problem by learning, on the fly, personalized models of both song and transition preferences tailored to each user's musical tastes. Playlist recommender systems typically include two main components: i) a preference-learning component, and ii) a planning component for selecting the next song in the playlist sequence. While there has been much work on the former, very little has been devoted to the latter. This paper bridges this gap by focusing on the planning aspect of playlist generation within the context of DJ-MC, our playlist recommendation application. It also introduces a new variant of playlist recommendation that incorporates diversity and novelty directly into the reward model. We empirically demonstrate that the proposed planning approach significantly improves performance over the DJ-MC baseline in two playlist recommendation settings, increasing the usability of the framework in real-world settings.
