Rhyme and Style Features for Musical Genre Classification by Song Lyrics

How individuals perceive music is influenced by many different factors. The audible part of a piece of music, its sound, certainly contributes, but it is only one aspect to be taken into account. Cultural information also influences how we experience music, as do the lyrics of a song. Alongside symbolic and audio-based music information retrieval, which focus on the sound of music, song lyrics may thus be used to improve classification or similarity ranking of music. Song lyrics exhibit specific properties that distinguish them from traditional text documents: many lyrics are, for example, composed in rhyming verses, and the frequencies of certain parts of speech may differ from those in other text documents. Further, lyrics may use slang or differ greatly in the length and complexity of the language used, which can be measured by statistical features such as word and verse length and the amount of repetitive text. In this paper, we present a novel set of features developed for the textual analysis of song lyrics, compare them to classical bag-of-words indexing approaches, and combine the two. We present results for musical genre classification on a test collection to demonstrate our analysis.
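To make the kind of feature described above more concrete, the following is a minimal sketch of how simple text statistics and a crude rhyme indicator might be computed from raw lyrics. It is not the feature set of the paper: the function names `style_features` and `naive_rhyme_ratio`, the feature names, and the suffix-matching heuristic (which compares spelling rather than pronunciation) are assumptions made purely for illustration.

```python
import re
from collections import Counter


def _tokens(line: str) -> list[str]:
    # Lowercased word tokens; apostrophes are kept so "don't" stays one word.
    return re.findall(r"[a-z']+", line.lower())


def style_features(lyrics: str) -> dict[str, float]:
    """Illustrative text statistics over non-empty lyric lines (hypothetical feature names)."""
    lines = [ln.strip() for ln in lyrics.splitlines() if ln.strip()]
    words = [w for ln in lines for w in _tokens(ln)]
    line_counts = Counter(ln.lower() for ln in lines)
    # Number of lines whose exact text occurs more than once (e.g. a chorus).
    repeated = sum(c for c in line_counts.values() if c > 1)
    return {
        "avg_word_length": sum(map(len, words)) / len(words) if words else 0.0,
        "avg_words_per_line": len(words) / len(lines) if lines else 0.0,
        "repeated_line_ratio": repeated / len(lines) if lines else 0.0,
        "unique_word_ratio": len(set(words)) / len(words) if words else 0.0,
    }


def naive_rhyme_ratio(lyrics: str, suffix_len: int = 2) -> float:
    """Fraction of adjacent line pairs whose last words end in the same letters.

    Assumption: spelling stands in for sound here. A realistic rhyme
    detector would compare phonetic transcriptions instead.
    """
    endings = []
    for ln in lyrics.splitlines():
        toks = _tokens(ln)
        if toks:
            endings.append(toks[-1][-suffix_len:])
    pairs = list(zip(endings, endings[1:]))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)
```

Values of this kind could be placed alongside a bag-of-words vector for each song before training a genre classifier. Note that the rhyme heuristic also counts a repeated final word as a rhyme, which a more careful implementation would likely treat separately.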
