The Greek Music Dataset

Music Information Research (MIR) requires musical data to test methods and compare results. Greek music has a number of characteristics that distinguish its pieces from the popular tracks found in currently available datasets, which creates a need for dedicated Greek datasets in MIR. This work presents the Greek Music Dataset (GMD), a collection of musical information pertaining to Greek musical pieces. GMD significantly extends the Greek Audio Dataset: it adds symbolic information (both features and raw MIDI files), includes multi-label manual genre categorisation of the content, increases the number of included tracks, and balances the content in terms of genre. GMD covers 1400 Greek tracks. For each track, the dataset includes pre-computed audio, lyrics and symbolic features for immediate use in MIR tasks, manually annotated mood and genre labels, generic objective metadata, a manually selected MIDI file (available for 500 of the tracks), and a manually selected link to a performance or audio content on YouTube for further research.
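As a rough illustration of the per-track contents listed above, the following minimal sketch models one track as a Python record. The class name `TrackRecord`, all field names, the helper functions and the sample values are hypothetical and chosen only for illustration; they are not the dataset's actual schema or file layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrackRecord:
    """Hypothetical per-track record; field names are illustrative only."""
    title: str
    artist: str
    genres: List[str] = field(default_factory=list)  # multi-label genre annotation
    mood: Optional[str] = None                       # manually annotated mood label
    audio_features: Dict[str, float] = field(default_factory=dict)
    lyrics_features: Dict[str, float] = field(default_factory=dict)
    symbolic_features: Dict[str, float] = field(default_factory=dict)
    midi_path: Optional[str] = None                  # present for only some tracks
    youtube_url: Optional[str] = None


def tracks_with_midi(tracks: List[TrackRecord]) -> List[TrackRecord]:
    """Keep only the tracks that come with a MIDI file."""
    return [t for t in tracks if t.midi_path is not None]


def tracks_in_genre(tracks: List[TrackRecord], genre: str) -> List[TrackRecord]:
    """Keep the tracks whose genre labels include the given genre."""
    return [t for t in tracks if genre in t.genres]


# Placeholder data, not taken from the dataset.
sample = [
    TrackRecord("song one", "artist a", genres=["genre_a", "genre_b"],
                midi_path="midi/song_one.mid"),
    TrackRecord("song two", "artist b", genres=["genre_b"]),
]
```

Because the genre annotation is multi-label, a track can be returned by `tracks_in_genre` for more than one genre, while `tracks_with_midi` reflects that symbolic data exists for only a subset of the tracks.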
