In this paper, we propose a framework for the automated assessment of participation in classroom or professional meeting discussions using audio analysis. Participation is key to the success of businesses and schools, so these establishments aim to measure, incentivise, and ultimately increase it. Currently, the assessment process is mostly subjective: the meeting chair or the teacher judges the level of participation by relying on memory or on recorded notes. The unreliability of this approach motivates objective, automated tools. We propose a collaborative framework that uses smartphones to capture multiple audio recordings of the meeting and merges them into a single enhanced recording, offsetting the volume differences between individual recordings that result from varying distances to the speakers. We use speech diarisation and speaker identification to segment the enhanced audio signal and recognise the identity of the participants in the discussion. We calculate participation statistics as percentages and present them to the discussion participants in graphical form, which helps keep participation balanced between them. Our results validate the usefulness and potential of our framework in objectively estimating the participation level. The proposed framework is useful as a learning assessment tool and, when integrated with speech recognition, opens the door to automated minute taking.
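The abstract does not say how the participation percentages are computed. As a minimal sketch, assuming participation is measured as each speaker's share of the total speaking time, and assuming the diarisation and speaker-identification stage yields labelled segments of the form (speaker, start, end) in seconds, the statistic could be derived as follows. The function name `participation_percentages`, the segment format, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def participation_percentages(segments):
    """Return each speaker's share of total speaking time, in percent.

    `segments` is an iterable of (speaker, start_s, end_s) tuples.
    This input format is an assumption made for illustration only.
    """
    totals = defaultdict(float)
    for speaker, start_s, end_s in segments:
        if end_s < start_s:
            raise ValueError(f"segment ends before it starts: {speaker!r}")
        totals[speaker] += end_s - start_s

    total_speech = sum(totals.values())
    if total_speech == 0:
        return {}  # no speech detected, so there is nothing to report
    return {name: 100.0 * t / total_speech for name, t in totals.items()}


# Hypothetical diarisation output for a 75-second discussion.
segments = [
    ("Alice", 0.0, 30.0),
    ("Bob", 30.0, 45.0),
    ("Alice", 45.0, 60.0),
    ("Carol", 60.0, 75.0),
]
shares = participation_percentages(segments)
for name, pct in sorted(shares.items()):
    print(f"{name}: {pct:.1f}%")
```

Normalising by total speech time rather than by meeting length keeps the shares summing to 100% even when the recording contains silence; other definitions, such as the number of turns taken, are equally plausible under the abstract's wording.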