Maximum entropy and maximum likelihood criteria for feature selection from multivariate data

We discuss several numerical methods for optimum feature selection from multivariate data based on maximum entropy and maximum likelihood criteria. Our point of view is to regard the observed data x^1, x^2, ..., x^N in R^d as samples from some unknown pdf P. We project these data onto d directions, estimate the pdf of the univariate data in each direction, find the maximum entropy (or likelihood) over all multivariate pdfs in R^d whose marginals in these directions are prescribed by the estimated univariate pdfs, and finally maximize the entropy (or likelihood) further over the choice of directions. This strategy for optimal feature selection depends on the method used to estimate the univariate pdfs.
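As a rough illustration of the entropy variant of this strategy, the sketch below is one possible reading under simplifying assumptions, not the paper's actual procedure: it uses a histogram-based estimate of each univariate (marginal) entropy, relies on the fact that for orthonormal directions the maximum-entropy joint pdf with prescribed marginals is their product (so its entropy is the sum of the marginal entropies), and replaces the final optimization over directions with a simple random search over orthonormal bases. All function names (marginal_entropy, projected_entropy, select_directions) and parameter choices (bin count, number of restarts) are hypothetical.

```python
# Illustrative sketch only; assumes a histogram entropy estimator and
# random search over orthonormal bases, not the method of the paper.
import numpy as np

def marginal_entropy(samples, bins=32):
    """Histogram-based differential entropy estimate of a univariate sample."""
    counts, edges = np.histogram(samples, bins=bins, density=True)
    widths = np.diff(edges)
    p = counts * widths            # probability mass in each bin
    nz = p > 0
    # differential entropy estimate: -sum_i p_i * log(density_i)
    return -np.sum(p[nz] * np.log(counts[nz]))

def projected_entropy(X, W):
    """Sum of marginal entropies of X projected onto the columns of W.

    For orthonormal directions, the maximum-entropy multivariate pdf with
    these marginals is the product pdf, whose entropy is this sum.
    """
    Z = X @ W
    return sum(marginal_entropy(Z[:, j]) for j in range(Z.shape[1]))

def select_directions(X, n_restarts=200, seed=0):
    """Random search over orthonormal bases maximizing the entropy criterion."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    best_W = np.eye(d)
    best_H = projected_entropy(X, best_W)
    for _ in range(n_restarts):
        # random orthonormal basis via QR of a Gaussian matrix
        W, _ = np.linalg.qr(rng.standard_normal((d, d)))
        H = projected_entropy(X, W)
        if H > best_H:
            best_W, best_H = W, H
    return best_W, best_H

# Example: N samples from an (here synthetic) pdf in R^d
X = np.random.default_rng(1).multivariate_normal(
    mean=[0.0, 0.0], cov=[[2.0, 1.2], [1.2, 1.0]], size=2000)
W, H = select_directions(X)
print("selected directions:\n", W, "\nentropy criterion:", H)
```

In this reading, the inner maximization (over joint pdfs with fixed marginals) is solved in closed form by the product density, so only the outer search over direction choices remains; a gradient-based or manifold optimizer over the orthogonal group could replace the random search.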