LOCUST - Longitudinal Corpus and Toolset for Speaker Verification

In this paper, we present a new longitudinal corpus and a toolset intended to address the influence of voice aging on speaker verification. We examined previous longitudinal research on age-related voice changes, as well as its applicability to real-world use cases. Our findings reveal that researchers have treated age-related voice changes as a hindrance instead of leveraging them to the advantage of the identity validator. Additionally, we found a significant dearth of publicly available corpora, in terms of both the time span covered by the recordings and the number of participants. We also identified a significant bias toward developing speaker recognition technologies for government surveillance systems rather than speaker verification systems for civilian IT security. To address these issues, we built an open project with the largest publicly available longitudinal speaker database, which includes 229 speakers with an average talking time exceeding 15 hours, spanning an average of 21 years per speaker. We assembled, cleaned, and normalized the audio recordings and developed software tools for speech feature extraction, all of which we are releasing into the public domain.
