Hit Song Prediction for Pop Music by Siamese CNN with Ranking Loss

A model for hit song prediction can help the pop music industry identify emerging trends and promising artists or songs before they are marketed to the public. While most previous work formulates hit song prediction as a regression or classification problem, this paper presents a convolutional neural network (CNN) model that treats it as a ranking problem. Specifically, we use a commercial dataset with daily play counts to train a multi-objective Siamese CNN model, combining a Euclidean loss with a pairwise ranking loss, so that the model learns the relative ranking relations among songs from audio. In addition, we devise several pair sampling methods based on empirical observations of the data. Our experiments show that the proposed model, when trained with a sampling method called A/B sampling, achieves much higher accuracy in hit song prediction than the baseline regression model. Accuracy improves further when a neural attention mechanism is used to extract the highlights of songs and when a separate CNN model supplies high-level features of songs.
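To make the multi-objective idea concrete, the following is a minimal sketch of how a Euclidean loss and a pairwise ranking loss might be combined for one pair of songs. It is not the authors' implementation: the hinge form of the ranking term, the `margin` and `ranking_weight` parameters, and all function names are assumptions made for illustration. In a Siamese setup the two predictions would come from the same CNN applied to two audio inputs; here they are plain numbers.

```python
def euclidean_loss(pred: float, target: float) -> float:
    """Squared-error (Euclidean) loss for one song's predicted popularity score."""
    return 0.5 * (pred - target) ** 2


def pairwise_ranking_loss(pred_a: float, pred_b: float,
                          target_a: float, target_b: float,
                          margin: float = 1.0) -> float:
    """Hinge-style ranking loss (assumed form): penalize the pair when the
    predicted order disagrees with, or is too close to, the true order."""
    # +1 if song A should rank above song B, -1 if below, 0 for a tie.
    sign = (target_a > target_b) - (target_a < target_b)
    if sign == 0:
        return 0.0  # a tied pair carries no ranking information
    return max(0.0, margin - sign * (pred_a - pred_b))


def siamese_pair_loss(pred_a: float, pred_b: float,
                      target_a: float, target_b: float,
                      ranking_weight: float = 1.0,
                      margin: float = 1.0) -> float:
    """Multi-objective loss for one pair: regression on both branches plus
    a weighted ranking term (the weighting scheme is an assumption)."""
    regression = euclidean_loss(pred_a, target_a) + euclidean_loss(pred_b, target_b)
    ranking = pairwise_ranking_loss(pred_a, pred_b, target_a, target_b, margin)
    return regression + ranking_weight * ranking


if __name__ == "__main__":
    # Correctly ordered pair with exact scores: both terms vanish.
    print(siamese_pair_loss(2.0, 1.0, 2.0, 1.0))
    # Swapped predictions: both the regression and ranking terms are penalized.
    print(siamese_pair_loss(1.0, 2.0, 2.0, 1.0))
```

The ranking term depends only on the difference between the two predictions, so it pushes the model to order songs correctly even when the absolute scores are off, while the Euclidean term keeps each score anchored to its target.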
