Genre Differences of Song Lyrics and Artist Wikis: An Analysis of Popularity, Length, Repetitiveness, and Readability

Music is known to exhibit different characteristics, depending on genre and style. While most research that studies such differences takes a musicological perspective and analyzes acoustic properties of individual pieces or artists, we conduct a large-scale analysis using various web resources. Exploiting content information from song lyrics, contextual information reflected in music artists' Wikipedia articles, and listening information, we particularly study the aspects of popularity, length, repetitiveness, and readability of lyrics and Wikipedia articles. We measure popularity in terms of song play count (PC) and listener count (LC), length in terms of character and word count, repetitiveness in terms of text compression ratio, and readability in terms of the Simple Measure of Gobbledygook (SMOG). Extending datasets of music listening histories and genre annotations from Last.fm, we extract and analyze 424,476 song lyrics by 18,724 artists from LyricWiki. We set out to answer whether there exist significant genre differences in song lyrics (RQ1) and artist Wikipedia articles (RQ2) in terms of repetitiveness and readability. We also assess whether we can find evidence to support the cliché that lyrics of very popular artists are particularly simple and repetitive (RQ3). We further investigate whether the characteristics of popularity, length, repetitiveness, and readability correlate within and between lyrics and Wikipedia articles (RQ4). We identify substantial differences in repetitiveness and readability of lyrics between music genres. In contrast, no significant differences between genres are found for artists' Wikipedia pages. Also, we find that lyrics of highly popular artists are repetitive but not necessarily simple in terms of readability. Furthermore, we uncover weak correlations between length of lyrics and of Wikipedia pages of the same artist, weak correlations between lyrics' reading difficulty and their length, and moderate correlations between artists' popularity and length of their lyrics.
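
To make the two text measures named above more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the compression ratio is the uncompressed size divided by the DEFLATE-compressed size (via the standard `zlib` module), so that higher values indicate more repetitive text, and it uses the commonly cited SMOG formula with a crude vowel-group syllable counter. The function names `compression_ratio`, `count_syllables`, and `smog_grade` are hypothetical, and the sentence splitting on punctuation is a simplification that may not suit lyrics, which often lack punctuation.

```python
import math
import re
import zlib


def compression_ratio(text: str) -> float:
    """Uncompressed size / compressed size (assumed definition).

    Higher values mean the text compresses better, i.e. is more repetitive.
    """
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    return len(raw) / len(zlib.compress(raw, 9))


def count_syllables(word: str) -> int:
    """Crude heuristic: count groups of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def smog_grade(text: str) -> float:
    """SMOG grade: 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    # Simplification: treat '.', '!' and '?' as sentence boundaries.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    words = re.findall(r"[A-Za-z']+", text)
    # Polysyllables are words with three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


if __name__ == "__main__":
    chorus = "la la la la " * 50
    prose = "The quick brown fox jumps over the lazy dog."
    print(compression_ratio(chorus), compression_ratio(prose))
    print(smog_grade("Complicated terminology undermines readability."))
```

Under these assumptions, a heavily repeated chorus yields a much larger compression ratio than a short varied sentence, and a text without any polysyllabic words receives the formula's minimum grade of 3.1291.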

[1]  Abraham Lempel,et al.  A universal algorithm for sequential data compression , 1977, IEEE Trans. Inf. Theory.

[2]  G. Harry McLaughlin,et al.  SMOG Grading - A New Readability Formula. , 1969 .

[3]  Christine Bauer,et al.  Introducing Global and Regional Mainstreaminess for Improving Personalized Music Recommendation , 2017, MoMM.

[4]  Xavier Serra,et al.  Exploring Customer Reviews for Music Genre Classification and Evolutionary Studies , 2016, ISMIR.

[5]  Marián Boguñá,et al.  Measuring the Evolution of Contemporary Western Popular Music , 2012, Scientific Reports.

[6]  Jessica R. Levi,et al.  Readability Trends of Online Information by the American Academy of Otolaryngology—Head and Neck Surgery Foundation , 2017, Otolaryngology--head and neck surgery : official journal of American Academy of Otolaryngology-Head and Neck Surgery.

[7]  Markus Schedl,et al.  The LFM-1b Dataset for Music Retrieval and Recommendation , 2016, ICMR.

[8]  W. K. Campbell,et al.  Tuning in to psychological change: Linguistic markers of psychological traits and emotions over time in popular U.S. song lyrics. , 2011 .

[9]  L. G. Doak,et al.  Teaching Patients With Low Literacy Skills , 1985 .

[10]  Matthias Mauch,et al.  The Minor fall, the Major lift: inferring emotional valence of musical chords through lyrics , 2015, Royal Society Open Science.

[11]  Bruce Ferwerda,et al.  Large-Scale Analysis of Group-Specific Music Genre Taste from Collaborative Tags , 2017, 2017 IEEE International Symposium on Multimedia (ISM).

[12]  Markus Schedl,et al.  Investigating country-specific music preferences and music recommendation algorithms with the LFM-1b dataset , 2017, International Journal of Multimedia Information Retrieval.

[13]  Paul Lamere,et al.  Social Tagging and Music Information Retrieval , 2008 .

[14]  Peter Knees,et al.  Predicting user demographics from music listening information , 2019, Multimedia Tools and Applications.

[15]  E. Schellenberg,et al.  Emotional Cues in American Popular Music: Five Decades of the Top 40 , 2012 .