The optimal nonlinear features for a criterion function of the general form f(D_{1},\cdots,D_{M},K_{1},\cdots,K_{M}) are studied, where the D_{j} and the K_{j} are the conditional first- and second-order moments. The optimal solution is found to be a parametric function of the conditional densities. By imposing a further restriction on the functional dependence of f on the K_{j}, the optimal mapping becomes an intuitively pleasing function of the posterior probabilities. Given a finite number of features \psi_{1}(X),\cdots,\psi_{L}(X), the problem of finding the best linear mapping to m features is next investigated. The resulting optimum mapping is a linear combination of the projections of the posterior probabilities onto the subspace spanned by the \psi_{j}(X). The problems of finding the best single feature and of sequential feature selection are discussed in this framework. Finally, several examples are presented.
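As a rough illustration of the finite-feature result, the following Python sketch projects plug-in posterior-probability estimates onto the subspace spanned by a chosen feature set \psi_{j}(X) via least squares and takes the dominant directions of those projections as the m linear features. The two-Gaussian toy data, the plug-in posterior estimates, and the eigendecomposition used to combine the projections are illustrative assumptions, not the paper's criterion or derivation.

```python
# Hypothetical sketch: project class-posterior probabilities onto the span of
# a given feature set psi_j(X), then take m dominant directions of those
# projections as linear features.  The Gaussian toy data, the plug-in
# posterior estimates, and the eigendecomposition step are all illustrative
# assumptions, not the paper's exact criterion.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Two Gaussian classes in 2-D (assumed toy data).
n = 500
X0 = rng.normal(loc=[-1.0, 0.0], scale=1.0, size=(n, 2))
X1 = rng.normal(loc=[+1.0, 0.5], scale=1.0, size=(n, 2))
X = np.vstack([X0, X1])
y = np.hstack([np.zeros(n), np.ones(n)])

def psi(X):
    """A finite feature set psi_1(X), ..., psi_L(X): here a quadratic map."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1**2, x2**2, x1 * x2])

def posteriors(X):
    """Plug-in posterior estimates q_i(X) from the known class densities."""
    p0 = multivariate_normal(mean=[-1.0, 0.0], cov=np.eye(2)).pdf(X)
    p1 = multivariate_normal(mean=[+1.0, 0.5], cov=np.eye(2)).pdf(X)
    total = p0 + p1
    return np.column_stack([p0 / total, p1 / total])

Psi = psi(X)        # N x L matrix of feature values
Q = posteriors(X)   # N x M matrix of posterior probabilities

# Least-squares projection of each posterior onto span{psi_1, ..., psi_L}.
B, *_ = np.linalg.lstsq(Psi, Q, rcond=None)   # L x M coefficient matrix

# Combine the projections: take the m dominant directions of B B^T
# (one plausible way to form "linear combinations of the projections").
m = 1
evals, evecs = np.linalg.eigh(B @ B.T)
A = evecs[:, -m:]   # L x m linear mapping applied to psi(X)

Z = Psi @ A         # the m extracted features
print("mapping A:\n", A)
print("class means of extracted feature:",
      Z[y == 0].mean(), Z[y == 1].mean())
```

For m = 1 the sketch reduces to choosing a single direction in the span of the \psi_{j}(X) along which the projected posteriors separate, which is roughly the flavor of the best-single-feature problem mentioned above.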