Improved automated classification of basketball crowd noise

This paper describes the combined use of supervised and unsupervised machine learning (ML) methods to improve automatic classification of crowd responses to events at collegiate basketball games. The work builds on recent investigations by the research team in which the two ML approaches were treated separately. In one case, crowd response events (cheers, applause, etc.) were manually labeled, and a subset of the labeled events was used as a training set for supervised-ML event classification. In the other, unsupervised k-means clustering was used to divide a game’s one-twelfth-octave spectrogram into six distinct clusters. A comparison of the two approaches shows that the manually labeled crowd responses are grouped into only one or two of the six unsupervised clusters. This paper describes how the supervised-ML labels guide improvements to the k-means clustering analysis, such as determining which additional audio features are required as inputs, and how both approaches can be used in tandem to improve automated classification of crowd noise at basketball games.
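The comparison described above, checking how manually labeled events distribute across unsupervised clusters, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the research team's implementation: the frames are synthetic stand-ins for spectrogram time slices (8 band levels in dB rather than a full one-twelfth-octave spectrum), the label names and levels are hypothetical, and k-means is written as a plain-Python Lloyd's algorithm where a real analysis would more likely use an established library.

```python
import random
from collections import Counter


def synthetic_frames(seed=1, n_per_class=40, n_bands=8):
    """Build hypothetical spectral frames; None marks an unlabeled frame."""
    rng = random.Random(seed)
    # Assumed mean band levels in dB, chosen only to make the classes differ.
    mean_levels = {"cheer": 85.0, "applause": 75.0, None: 60.0}
    frames, labels = [], []
    for label, level in mean_levels.items():
        for _frame in range(n_per_class):
            frames.append([rng.gauss(level, 2.0) for _band in range(n_bands)])
            labels.append(label)
    return frames, labels


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(frames, k, n_iter=50, seed=0):
    """Lloyd's algorithm; returns one cluster index per frame."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    assignments = None
    for _iteration in range(n_iter):
        # Assignment step: each frame joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(f, centroids[c]))
            for f in frames
        ]
        if new_assignments == assignments:
            break  # converged: no frame changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, a in zip(frames, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def label_cluster_table(labels, assignments):
    """Count, for each manual label, how many frames fall in each cluster."""
    table = {}
    for label, cluster in zip(labels, assignments):
        if label is not None:
            table.setdefault(label, Counter())[cluster] += 1
    return table


frames, labels = synthetic_frames()
assignments = kmeans(frames, k=6)  # six clusters, as in the abstract
table = label_cluster_table(labels, assignments)
for label, counts in table.items():
    print(label, dict(sorted(counts.items())))
```

Each printed row shows how one manual label spreads over the six clusters. A label concentrated in one or two clusters corresponds to the pattern reported in the abstract, and a table of this kind is one way to judge whether adding input features changes how the labeled events separate.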