Feature Space Visualization with Spatial Similarity Maps for Pathological Speech Data

The feature vectors of a data set encode information about relations between speaker groups, clusters, and outliers. Assuming that these relations are preserved in the spatial properties of the feature vectors, we introduce similarity maps that visualize consistencies and deviations in magnitude and orientation between pairs of feature vectors. We also present an iterative approach for finding subspaces of a high-dimensional feature space that encode information about predefined speaker clusters. The methods were evaluated on two data sets: one from chronically hoarse speakers, and one from Parkinson's disease patients and a healthy control group. The results show that similarity maps provide a reasonable visualization of speaker groups and of the spatial properties of their feature vectors. With the iterative optimization, it was possible to find features that show pronounced spatial differences between the predefined clusters.
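To make the idea of comparing feature vectors by orientation and magnitude concrete, the following is a minimal sketch, not the implementation evaluated here. It assumes that orientation is compared with cosine similarity and magnitude with the ratio of the smaller to the larger Euclidean norm; the function name `similarity_maps`, the `eps` guard, and the toy data are hypothetical choices made for illustration only.

```python
import numpy as np


def similarity_maps(features, eps=1e-12):
    """Pairwise orientation and magnitude maps for the rows of `features`.

    Assumption (not taken from the abstract): orientation is measured with
    cosine similarity, magnitude with the ratio min(norm) / max(norm).
    """
    X = np.asarray(features, dtype=float)
    norms = np.linalg.norm(X, axis=1)

    # Orientation map: cosine of the angle between each pair of vectors.
    unit = X / np.maximum(norms, eps)[:, None]
    orientation = np.clip(unit @ unit.T, -1.0, 1.0)

    # Magnitude map: 1 means equal length, values near 0 mean very
    # different lengths (a pair of all-zero vectors also yields 0).
    smaller = np.minimum(norms[:, None], norms[None, :])
    larger = np.maximum(norms[:, None], norms[None, :])
    magnitude = smaller / np.maximum(larger, eps)

    return orientation, magnitude


# Toy data: three hypothetical speakers with two features each.
toy = np.array([[1.0, 0.0],
                [2.0, 0.0],
                [0.0, 3.0]])
orientation_map, magnitude_map = similarity_maps(toy)
```

In this toy example the first two speakers point in the same direction but differ in length, while the third is orthogonal to both. Either matrix can be rendered as a heat map (for instance with an image plot), with rows and columns sorted by speaker group so that group structure appears as blocks.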
