Supervised and unsupervised machine learning approaches to classifying chimpanzee vocalizations

Quantitative tools for classifying vocal repertoires have evolved steadily alongside developments in machine learning and speech recognition research and increases in computing power. Classifying vocalizations involves two main methodological choices: (i) the classification technique and (ii) the features used for classification. Current state-of-the-art classification techniques include artificial neural networks (ANNs), support vector machines (SVMs), and ensemble methods such as random forests (RFs). Current state-of-the-art features from speech recognition research include mel frequency cepstral coefficients (MFCCs). Bioacoustics researchers have applied these tools to problems including individual, species, and call-type identification, as well as vocal repertoire classification. However, researchers studying non-human primate vocalizations have only recently started adopting these approaches, and none have applied them to chimpanzee vocalizations. Here, we analyze vocalizations recorded in Gombe National Park, Tanzania. First, we use supervised classification techniques (ANNs, SVMs, and RFs), which are trained on predefined call types, and evaluate their classification accuracy. Second, we use unsupervised techniques, which require no prior knowledge of call types, namely k-means clustering and self-organizing neural networks, to identify discrete call types. We discuss the results from both the supervised and the unsupervised techniques and their strengths relative to traditional methods.
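As a rough illustration of the unsupervised step, the following sketch runs k-means clustering on per-call feature vectors using only the Python standard library. It is not the implementation used in this study. The function names (`farthest_first_init`, `kmeans`, `jitter`) are hypothetical, the two-dimensional points are synthetic stand-ins for acoustic features such as MFCC summaries, and the deterministic farthest-first initialization is a simplification chosen for reproducibility.

```python
import random
from math import dist


def farthest_first_init(points, k):
    """Choose k starting centroids: the first point, then repeatedly the
    point farthest from every centroid chosen so far (an assumption made
    here for determinism; random restarts are a common alternative)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps
    until the labels stop changing. Returns (centroids, labels)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Synthetic stand-ins for two acoustically distinct call types
# (hypothetical data, not the Gombe recordings).
rng = random.Random(0)


def jitter(center, n):
    """n points scattered uniformly within +/-0.5 of `center` per axis."""
    return [tuple(c + rng.uniform(-0.5, 0.5) for c in center) for _ in range(n)]


features = jitter((0.0, 0.0), 20) + jitter((5.0, 5.0), 20)
centroids, labels = kmeans(features, k=2)
```

In this toy setting the two groups are well separated, so the recovered clusters match the groups that generated the data. Real vocalization features are higher dimensional and may grade into one another, so the number of clusters `k` would have to be chosen and validated rather than assumed.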