Supervised learning from incomplete data via an EM approach

Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data sets. We use mixture models for the density estimates and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm: EM is used both for the estimation of mixture components and for coping with missing data. The resulting algorithm is applicable to a wide range of supervised as well as unsupervised learning problems. Results from a classification benchmark, the iris data set, are presented.
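The two uses of EM described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes Gaussian components with diagonal covariances (a simplification chosen here so that no matrix inversion is needed), marks missing entries with `None`, and assumes every dimension is observed in at least one row. The names `responsibilities`, `fit`, `impute`, `init_means`, and `min_var` are hypothetical and introduced only for this illustration.

```python
import math

def responsibilities(x, weights, means, variances):
    """Posterior over components, using only the observed entries of x."""
    logp = []
    for w, mu, var in zip(weights, means, variances):
        lp = math.log(w)
        for xj, m, v in zip(x, mu, var):
            if xj is not None:  # missing dimensions are marginalised out
                lp += -0.5 * (math.log(2 * math.pi * v) + (xj - m) ** 2 / v)
        logp.append(lp)
    top = max(logp)  # subtract the max for numerical stability
    p = [math.exp(l - top) for l in logp]
    total = sum(p)
    return [pi / total for pi in p]

def fit(data, init_means, iters=50, min_var=1e-3):
    """EM for a diagonal Gaussian mixture; returns (weights, means, variances)."""
    n, d, k = len(data), len(data[0]), len(init_means)
    means = [list(m) for m in init_means]
    # Start every component at the per-dimension variance of the observed entries.
    start_var = []
    for j in range(d):
        obs = [x[j] for x in data if x[j] is not None]
        centre = sum(obs) / len(obs)
        start_var.append(max(sum((v - centre) ** 2 for v in obs) / len(obs), min_var))
    variances = [start_var[:] for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step, first use of EM: which component generated each row?
        resp = [responsibilities(x, weights, means, variances) for x in data]
        new_w, new_m, new_v = [], [], []
        for c in range(k):
            nc = max(sum(r[c] for r in resp), 1e-12)  # guard against empty components
            mu_c, var_c = [], []
            for j in range(d):
                old_m, old_v = means[c][j], variances[c][j]
                # E-step, second use of EM: a missing entry contributes its expected
                # value (old mean) and expected squared deviation (adds old variance).
                mu = sum(r[c] * (x[j] if x[j] is not None else old_m)
                         for x, r in zip(data, resp)) / nc
                sq = sum(r[c] * ((x[j] - mu) ** 2 if x[j] is not None
                                 else (old_m - mu) ** 2 + old_v)
                         for x, r in zip(data, resp)) / nc
                mu_c.append(mu)
                var_c.append(max(sq, min_var))  # floor keeps variances positive
            new_w.append(nc / n)
            new_m.append(mu_c)
            new_v.append(var_c)
        weights, means, variances = new_w, new_m, new_v
    return weights, means, variances

def impute(x, weights, means, variances):
    """Fill missing entries of x with their expected value under the fitted mixture."""
    r = responsibilities(x, weights, means, variances)
    return [xj if xj is not None else sum(rc * mu[j] for rc, mu in zip(r, means))
            for j, xj in enumerate(x)]
```

In this sketch, supervised prediction amounts to treating the target as one more dimension that happens to be missing at test time, so `impute` plays the role of the predictor. With full covariance matrices the expected value of a missing entry would also depend on the observed entries within each component; the diagonal assumption drops that dependence and keeps it only through the responsibilities.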

[1] Richard O. Duda, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.

[2] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm, 1977.

[3] Leo Breiman, et al. Classification and Regression Trees, 1984.

[4] D. Rubin, et al. Statistical Analysis with Missing Data, 1989.

[5] Geoffrey J. McLachlan, et al. Mixture models: inference and applications to clustering, 1989.

[6] Roderick J. A. Little, et al. Statistical Analysis with Missing Data, 1988.

[7] David J. Hand, et al. Mixture Models: Inference and Applications to Clustering, 1989.

[8] Donald F. Specht, et al. A general regression neural network, 1991, IEEE Trans. Neural Networks.

[9] Steven J. Nowlan, et al. Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures, 1991.

[10] J. Friedman, et al. Multivariate adaptive regression splines, 1991.

[11] Geoffrey E. Hinton, et al. Adaptive Mixtures of Local Experts, 1991, Neural Computation.

[12] Volker Tresp, et al. Network Structuring and Training Using Rule-Based Knowledge, 1992, NIPS.

[13] Zoubin Ghahramani, et al. Solving inverse problems using an EM approach to density estimation, 1993.

[14] Michael I. Jordan, et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1994, Neural Computation.

[15] Michael I. Jordan, et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1994.