This article presents an approach for selecting a limited subset of the most relevant, information-rich speech data from the full training set. The proposed method uses Principal Component Analysis (PCA) to select a lower-dimensional data subset whose variances are similar to those of the complete corpus. Three selection algorithms based on an eigenvalue criterion are presented. The first operates at the level of entire speech recordings. The second additionally segments each recording into experimentally sized blocks, dividing a recording into several smaller, more or less information-rich units. The third analyzes the speech recordings at the feature-vector level. Together, these represent criterion-based selection at three granularities, from the coarsest to the finest data level. The main aim of the presented experiments is to show that PCA trained on the limited data subset achieves results comparable to, or even better than, PCA trained on the entire speech corpus. This approach can radically speed up PCA training while requiring far less memory and computation. All methods are evaluated on a Slovak phoneme-based large-vocabulary continuous speech recognition task.
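The recording-level variant of such an eigenvalue criterion can be illustrated with a minimal sketch, assuming NumPy: each recording (a matrix of feature vectors) is scored by the sum of its covariance eigenvalues (its total variance), the highest-scoring fraction is kept, and PCA is then fitted on that subset. The function names, the scoring choice, and the selection fraction are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def eigenvalue_score(record):
    """Score one recording (shape: frames x features) by the sum of its
    covariance eigenvalues, which equals the trace (total variance)."""
    centered = record - record.mean(axis=0)
    cov = centered.T @ centered / max(len(record) - 1, 1)
    return np.linalg.eigvalsh(cov).sum()

def select_subset(records, fraction=0.3):
    """Keep the most information-rich recordings: those with the
    largest total variance under the eigenvalue score (an assumed
    criterion for this sketch)."""
    scores = [eigenvalue_score(r) for r in records]
    k = max(1, int(fraction * len(records)))
    keep = np.argsort(scores)[::-1][:k]
    return [records[i] for i in keep]

def fit_pca(data, n_components):
    """Plain PCA via eigendecomposition of the pooled covariance."""
    X = np.vstack(data)
    X = X - X.mean(axis=0)
    cov = X.T @ X / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_components]
    return vals[order], vecs[:, order]
```

The block- and feature-vector-level variants of the paper would apply the same scoring to finer units; only the unit over which the covariance is computed changes.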