This study addresses high-dimensional classification problems by means of feature selection. The data sets are those provided by the organizers of the Interspeech 2012 Speaker Trait Challenge. Using a k-nearest-neighbor classifier, a combination of two feature selection approaches gives results that approach or exceed the challenge baselines. The first method selects features so that the data set is covered by correct classifications, unsupervised or supervised, obtained from individual features. The second method applies a measure of statistical dependence between discretized features and class labels.

Index Terms: pattern recognition, feature selection, high-dimensional data, speaker characteristics
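The two selection ideas named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which dependence measure or which covering procedure is used, so both choices here are assumptions: mutual information stands in for the "measure of statistical dependence", and a greedy set-cover heuristic stands in for the covering step. The function names `mutual_information` and `greedy_cover`, and the toy data, are hypothetical and for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(feature_bins, labels):
    """Mutual information (in bits) between one discretized feature and the labels.

    Assumption: mutual information is used here only as an example of a
    statistical dependence measure; the source does not name its measure.
    """
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    p_x = Counter(feature_bins)
    p_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (p_x[x] * p_y[y]))
    return mi


def greedy_cover(correct_sets, n_samples):
    """Greedy set cover over samples.

    correct_sets maps a feature name to the set of sample indices that the
    feature, used alone, classifies correctly. Assumption: a greedy heuristic
    is one possible way to build the cover, not necessarily the source's.
    Returns the selected features and the samples left uncovered.
    """
    uncovered = set(range(n_samples))
    selected = []
    while uncovered:
        # Pick the feature that covers the most still-uncovered samples.
        best = max(correct_sets, key=lambda f: len(correct_sets[f] & uncovered))
        gain = correct_sets[best] & uncovered
        if not gain:
            break  # no single feature classifies the remaining samples correctly
        selected.append(best)
        uncovered -= gain
    return selected, uncovered


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    print(mutual_information([0, 0, 1, 1], labels))  # feature mirrors the labels
    print(mutual_information([0, 1, 0, 1], labels))  # feature independent of labels

    toy_sets = {"f1": {0, 1, 2}, "f2": {2, 3}, "f3": {3}}
    print(greedy_cover(toy_sets, n_samples=5))
```

In a combined scheme of the kind the abstract describes, features chosen by the cover and features ranked highly by the dependence measure could be merged before training the k-nearest-neighbor classifier; how the two selections are actually combined is not stated in the abstract.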