Musical Query-by-Semantic-Description Based on Convolutional Neural Network

We present a new music retrieval system based on query by semantic description (QBSD), in which a novel song can be used as a query and transformed into a semantic vector by a convolutional neural network. The method is based on supervised multi-class labeling (SML), in which a song is annotated with semantically meaningful tags and relevant songs are then retrieved from a semantically annotated database. Using the CAL500 data set in our experiments, we learn a deep learning model for each tag in the semantic space. To improve annotation performance, a loss function adjustment algorithm and the SMOTE algorithm are employed. The experimental results show that the model retrieves songs with high semantic similarity and provides a more natural approach to music retrieval.
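The retrieval step described above, in which a query song's semantic vector is compared against a semantically annotated database, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, so cosine similarity is used here purely as an assumption. The function names, song identifiers, and three-dimensional tag vectors are hypothetical. In the described system the vectors would come from the per-tag convolutional models rather than being written by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length tag-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector carries no semantic information
    return dot / (norm_u * norm_v)

def rank_songs(query_vec, database, top_k=3):
    """Return the top_k song ids ordered by similarity to the query vector."""
    scored = [(cosine(query_vec, vec), song_id) for song_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [song_id for _, song_id in scored[:top_k]]

# Hypothetical semantic vectors over three illustrative tags,
# e.g. ("rock", "calm", "acoustic"); values are made up.
database = {
    "song_a": [0.9, 0.1, 0.2],
    "song_b": [0.1, 0.8, 0.7],
    "song_c": [0.8, 0.2, 0.1],
}

# Stand-in for the semantic vector a trained network would produce for a query song.
query = [1.0, 0.0, 0.2]
ranking = rank_songs(query, database, top_k=2)
print(ranking)
```

Any other vector similarity or divergence measure could be substituted for `cosine` without changing the ranking logic.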
