Focusing on Maximum Entropy classification of lyrics by Tom Waits

This paper reports on a visual and computational text analysis of the lyrics of the rock musician Tom Waits. The analysis focuses on the widely acknowledged transition marked by his album Swordfishtrombones, which opened a new phase in Waits' 40-year career. A total of ten supervised learners are tested with the aim of separating the high-dimensional word-vector space built from his lyrics into two phases. After initial tests, particular attention is given to the Maximum Entropy classifier, which is explored further with additional pre-processing techniques. The classifier sheds further light on the two classes, separating them with an accuracy of 95%.
