The Greek Audio Dataset

The Greek Audio Dataset (GAD) is a freely available collection of audio features and metadata for a thousand popular Greek tracks. This work describes the creation process of the dataset together with its contents. Following the methodology of existing datasets, the GAD does not include the audio content of the tracks, owing to intellectual property rights. Instead, it includes features important to music information retrieval (MIR), extracted directly from the audio, in addition to lyrics and manually annotated genre and mood for each track. Moreover, each track comes with a link to available audio content on YouTube, to support researchers who need to extract new feature sets not included in the GAD. The selection of extracted features is based on the Million Song Dataset, so that researchers do not need new programming interfaces to take advantage of the GAD.
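To make the per-track contents listed above concrete, the following is a minimal sketch of how one GAD entry might be represented in memory. The class name `GadTrack`, its field names, and the helper `tracks_by_genre` are assumptions made for illustration only; they are not the dataset's actual schema or programming interface, and all values shown are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GadTrack:
    """Hypothetical view of one GAD entry; field names are illustrative only."""
    title: str
    artist: str
    genre: str        # manually annotated genre label
    mood: str         # manually annotated mood label
    lyrics: str
    youtube_url: str  # link to audio; the audio itself is not distributed
    # Extracted audio features, keyed by an (assumed) feature name.
    features: Dict[str, List[float]] = field(default_factory=dict)


def tracks_by_genre(tracks: List[GadTrack]) -> Dict[str, List[GadTrack]]:
    """Group tracks by their annotated genre label."""
    groups: Dict[str, List[GadTrack]] = {}
    for track in tracks:
        groups.setdefault(track.genre, []).append(track)
    return groups


if __name__ == "__main__":
    # Placeholder entries; none of these values come from the dataset.
    demo = [
        GadTrack("track-1", "artist-1", "genre_a", "mood_x", "", "https://example.invalid/1"),
        GadTrack("track-2", "artist-2", "genre_b", "mood_y", "", "https://example.invalid/2"),
        GadTrack("track-3", "artist-3", "genre_a", "mood_y", "", "https://example.invalid/3"),
    ]
    for genre, group in tracks_by_genre(demo).items():
        print(genre, [t.title for t in group])
```

Keeping the features in a plain dictionary reflects only that the abstract does not enumerate them; a real loader would follow whatever layout the distributed files use.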