A Deep Neural Network Approach to the LifeCLEF 2014 Bird Task

This paper describes the methods used in our submission to the LifeCLEF 2014 Bird task. A segmentation algorithm is created that is capable of segmenting the audio files of the Bird task dataset. These segments are used to select relevant Mel-Frequency Cepstral Coefficient (MFCC) frames from the MFCC dataset. Three datasets are created: 48, containing only the mean MFCC per segment; 96, containing the mean and variance of the MFCCs in a segment; and 240, containing the mean, the variance, and the means of three sections. These datasets are shuffled and split into a training set and a test set to train Deep Neural Networks with several topologies, which are capable of classifying the segments of the datasets. The best network correctly classified 73% of the segments. The results of a run from our system placed us 6th among the 10 participating teams. Follow-up research found that shuffling the data before splitting introduces overfitting, which can be reduced by not shuffling the datasets prior to splitting and by using dropout networks.
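The three feature sets can be illustrated with a minimal sketch. It is not the implementation used for the submission, and it rests on several assumptions: each MFCC frame has 48 coefficients (inferred from the dataset names, since 48 + 48 + 3 × 48 = 240), the "three sections" are consecutive, roughly equal thirds of a segment, and the variance is the population variance. The function name `segment_features` and the `mode` labels are hypothetical.

```python
import numpy as np


def segment_features(frames: np.ndarray, mode: str = "mean_var_sections") -> np.ndarray:
    """Aggregate the MFCC frames of one segment into a fixed-length vector.

    frames: array of shape (n_frames, n_coeffs), one row per MFCC frame.
    mode:   "mean"              -> n_coeffs values      (48 if n_coeffs == 48)
            "mean_var"          -> 2 * n_coeffs values  (96)
            "mean_var_sections" -> 5 * n_coeffs values  (240)
    """
    if frames.ndim != 2 or frames.shape[0] < 3:
        raise ValueError("expected a 2-D array with at least 3 frames")

    mean = frames.mean(axis=0)
    if mode == "mean":
        return mean

    # Assumption: population variance (ddof=0) over the frames of the segment.
    var = frames.var(axis=0)
    if mode == "mean_var":
        return np.concatenate([mean, var])

    if mode == "mean_var_sections":
        # Assumption: the three sections are consecutive, near-equal thirds.
        sections = np.array_split(frames, 3, axis=0)
        section_means = [section.mean(axis=0) for section in sections]
        return np.concatenate([mean, var, *section_means])

    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions, a segment of 30 frames with 48 coefficients each yields vectors of length 48, 96, and 240 for the three modes, so every segment maps to a fixed-size network input regardless of its duration.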