Plug-and-play microphones for recording speech and voice with smart devices

INTRODUCTION: Smart devices are widely available and can quickly record and upload speech segments for health-related analysis. The switch from laboratory recordings with professional-grade microphone setups to remote, smart device-based recordings offers immense potential for scaling voice assessment. Yet a growing body of literature shows that acoustic metrics vary widely in their robustness to differences between recording devices. Adding consumer-grade plug-and-play microphones has been proposed as a possible solution. This study assessed whether adding consumer-grade plug-and-play microphones increases the agreement in acoustic measurements between ultra-portable devices and a reference microphone.

METHODS: Speech was recorded simultaneously by a high-quality reference microphone commonly used in research and by two different configurations with plug-and-play microphones. Twelve speech-acoustic features were calculated from the recordings of each microphone to determine the agreement intervals in measurements between microphones. Agreement intervals were then compared to the deviations in speech expected in various neurological conditions. Each microphone's response to speech and to silence was characterized through acoustic analysis to explore possible reasons for differences in acoustic measurements between microphones. Lastly, the statistical differentiation of two groups, neurotypical speakers and people with multiple sclerosis, obtained with metrics from each tested microphone was compared to that obtained with the reference microphone.

RESULTS: Compared to the reference microphone, the two consumer-grade plug-and-play microphones favored high frequencies (mean difference in centre of gravity of at least 175.3 Hz) and recorded more noise (mean difference in signal-to-noise ratio of -4.2 dB or lower). Between the consumer-grade microphones, differences in relative noise were closely related to the distance between the microphone and the speaker's mouth. Agreement intervals between the reference and consumer-grade microphones remained under disease-expected deviations only for fundamental frequency (f0, agreement interval of at most 0.06 Hz), f0 instability (f0 CoV, agreement interval of at most 0.05%) and tracking of second formant movement (F2 slope, agreement interval of at most 1.4 Hz/ms). Agreement between microphones was poor for the other metrics, particularly for fine timing metrics (mean pause length and pause length variability across tasks). The statistical difference between the two groups of speakers was smaller with the plug-and-play microphones than with the reference microphone.

CONCLUSION: Measurements of f0 and F2 slope were robust to variation in recording equipment, while the other acoustic metrics were not. Thus, the tested plug-and-play microphones should not be used interchangeably with professional-grade microphones for speech analysis. Plug-and-play microphones may assist in equipment standardization within speech studies, including remote or self-recorded studies, possibly with a small loss in accuracy and statistical power, as observed in the current study.
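
The agreement analysis described in the methods can be illustrated with a minimal sketch. It assumes Bland-Altman-style 95% limits of agreement (mean difference plus or minus 1.96 standard deviations of the paired differences); the abstract does not state the exact computation, so this is an illustration rather than the study's procedure. The function names, the `z` parameter, and the rule for comparing an interval against a disease-expected deviation are hypothetical.

```python
import statistics


def limits_of_agreement(reference, candidate, z=1.96):
    """Return (bias, lower, upper) for paired measurements of one metric.

    Assumption: Bland-Altman-style limits, bias +/- z * SD of the paired
    differences (candidate minus reference). Not confirmed by the abstract.
    """
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must be paired")
    if len(reference) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - z * sd, bias + z * sd


def within_expected_deviation(lower, upper, expected_deviation):
    """Hypothetical rule: the microphones agree well enough for a metric if
    the largest absolute limit stays at or below the deviation expected
    from the disease."""
    return max(abs(lower), abs(upper)) <= expected_deviation


if __name__ == "__main__":
    # Made-up f0 values in Hz, for illustration only (not study data).
    ref_f0 = [100.0, 110.0, 120.0, 130.0]
    plug_f0 = [101.0, 109.0, 121.0, 131.0]
    bias, lower, upper = limits_of_agreement(ref_f0, plug_f0)
    print(f"bias={bias:.2f} Hz, limits=({lower:.2f}, {upper:.2f}) Hz")
```

Under this rule, a metric would be treated as interchangeable between microphones only when measurement disagreement is smaller than the effect one hopes to detect, which mirrors the comparison of agreement intervals with disease-expected deviations described above.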
