Nonlinear feature extraction with a general criterion function

The optimal nonlinear features for a criterion function of the general form f(D_{1},\cdots,D_{M},K_{1},\cdots,K_{M}) are studied, where the D_{j} and the K_{j} are the conditional first- and second-order moments. The optimal solution is found to be a parametric function of the conditional densities. By imposing a further restriction on the functional dependence of f on the K_{j}, the optimal mapping becomes an intuitively pleasing function of the posterior probabilities. Given a finite number of features \psi_{1}(X),\cdots,\psi_{L}(X), the problem of finding the best linear mapping to m features is next investigated. The resulting optimum mapping is a linear combination of the projections of the posterior probabilities onto the subspace spanned by the \psi_{j}(X). The problems of finding the best single feature and of sequential feature selection are discussed in this framework. Finally, several examples are presented.
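
As a rough illustration of the projection step mentioned above, the following minimal sketch estimates the projection of each posterior probability onto the span of a few features \psi_{j}(X) from labeled samples. It is not the paper's derivation. It assumes a least-squares (mean-square) projection, uses the fact that regressing class-indicator variables onto the \psi_{j} approximates the projection of the posteriors, and the function name `project_posteriors`, the polynomial features, and the synthetic two-class data are all hypothetical choices made for the example.

```python
import numpy as np


def project_posteriors(psi, labels, num_classes):
    """Least-squares projection of class indicators onto span{psi_j}.

    psi:    (N, L) array, row n holds psi_1(x_n), ..., psi_L(x_n)
    labels: (N,) integer class labels in [0, num_classes)
    Returns the (L, M) coefficient matrix and the (N, M) projected values.
    """
    psi = np.asarray(psi, dtype=float)
    # One-hot class indicators; their conditional mean given X is the posterior.
    onehot = np.eye(num_classes)[np.asarray(labels)]
    # Solve min ||psi @ coeffs - onehot||^2 column by column.
    coeffs, *_ = np.linalg.lstsq(psi, onehot, rcond=None)
    return coeffs, psi @ coeffs


# Hypothetical data: two 1-D Gaussian classes with shifted means.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1.0, 1.0, 200), rng.normal(1.0, 1.0, 200)])
labels = np.repeat([0, 1], 200)

# Assumed feature set: psi_1 = 1, psi_2 = x, psi_3 = x^2.
psi = np.column_stack([np.ones_like(x), x, x ** 2])

coeffs, projected = project_posteriors(psi, labels, num_classes=2)
residual = np.eye(2)[labels] - projected
```

Because the constant feature is in the assumed basis, the projected values for each sample sum to one across classes, and the residual is orthogonal to every \psi_{j} on the sample. Any linear mapping to m features would then be formed as a linear combination of the columns of `projected`; the choice of that combination depends on the criterion f and is not attempted here.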