Automatic Musical Pattern Feature Extraction Using Convolutional Neural Network

Music genre classification has been a challenging yet promising task in the field of music information retrieval (MIR). Because audio musical data have highly elusive characteristics, retrieving informative and reliable features from audio signals is crucial to the performance of any music genre classification system. Previous work on audio music genre classification systems mainly concentrated on timbral features, which limits performance. To address this problem, we propose a novel approach that extracts musical pattern features from audio music using a convolutional neural network (CNN), a model widely adopted in image information retrieval tasks. Our experiments show that a CNN has a strong capacity to capture informative features from the variations of musical patterns with minimal prior knowledge provided. Migrating technologies from another research field brings new opportunities to break through the current bottleneck of music genre classification. The proposed musical pattern feature extractor has advantages in several respects. It requires minimal prior knowledge to build. Once it has been obtained, the process of feature extraction is highly efficient. These two advantages guarantee the scalability of our feature extractors. Moreover, our musical pattern features are complementary to other mainstream feature sets used in other classification systems. Our experiments show that musical data have characteristics very similar to those of image data, so the variation of musical patterns can be captured using a CNN. We also show that the musical pattern features are informative for genre classification tasks.
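To make the idea of convolutional pattern extraction concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the audio has already been converted to a time-by-feature matrix of frames; the abstract does not specify the input representation, the network architecture, or how kernels are trained, so the function names, the hand-written kernel, and the pooling size here are all hypothetical. The sketch slides a kernel along the time axis, applies a rectifier, and subsamples in time, which is the basic operation a CNN layer would learn to perform.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, k: Matrix) -> Matrix:
    """Slide kernel k over x without padding (cross-correlation)."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(m: Matrix) -> Matrix:
    """Keep positive responses only."""
    return [[max(0, v) for v in row] for row in m]


def max_pool_time(m: Matrix, size: int) -> Matrix:
    """Non-overlapping max pooling along the time axis (rows).

    Trailing rows that do not fill a complete window are dropped.
    """
    n = len(m) // size
    return [
        [max(m[i * size + a][j] for a in range(size)) for j in range(len(m[0]))]
        for i in range(n)
    ]


def extract_pattern_features(frames: Matrix, kernels: List[Matrix], pool: int) -> List[float]:
    """Hypothetical extractor: one pooled feature map per kernel, flattened."""
    feats: List[float] = []
    for k in kernels:
        fmap = max_pool_time(relu(conv2d_valid(frames, k)), pool)
        feats.extend(v for row in fmap for v in row)
    return feats


# Toy input (assumption): 6 time frames x 3 feature bins, alternating
# silent and active frames, standing in for a repeating musical pattern.
frames = [[0, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1]]

# Hand-written kernel (assumption): responds when energy rises from one
# frame to the next. In a trained CNN the kernels would be learned.
onset_kernel = [[-1, -1, -1], [1, 1, 1]]

features = extract_pattern_features(frames, [onset_kernel], pool=2)
print(features)
```

In this toy case the kernel spans all feature bins, so each feature map has a single column and the pooled output summarizes where in time the pattern occurs. A classifier would then be trained on such feature vectors; that stage is omitted here.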
