Abstract
Data mining to discover patterns and aid decisions is key to using massive data for process automation and optimization. An especially challenging data mining problem is kriging, i.e., prediction of multiple related variables from latent patterns in the data. We present a manifold-based machine learning approach for discovering patterns in massive, correlated, high-dimensional data. Dimensionality reduction using a manifold is a type of non-linear principal component analysis (PCA). The manifold captures the underlying structure of the inputs and their corresponding outputs by projecting the data onto a set of basis functions defined by the manifold. These bases ensure that any future adjustments affect the model in a way that respects the natural geometry of the data. We chose the manifold learning technique for its robustness to unbalanced data. Our contribution, described in this paper, enables interactive and incremental learning, i.e., incremental adjustment of the manifold (and its predictions) based on new observations and on user corrections to the predicted values, without rerunning the analysis on the full data set. Our experiments demonstrate that, despite these practically useful enhancements, prediction performance remains equivalent to that of multi-kernel Gaussian processes on standard data sets.
[1] Jerry Alan Fails, et al. Interactive machine learning. IUI '03, 2003.
[2] Fan Chung, et al. Spectral Graph Theory. 1996.
[3] Katsumi Tanaka, et al. Interactive Visual Clustering for Relational Data. 2008.
[4] Geoffrey E. Hinton, et al. Evaluation of Gaussian processes and other methods for non-linear regression. 1997.
[5] Jitendra Malik, et al. Spectral grouping using the Nystrom method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004.
[6] Eric Eaton, et al. Interactive Learning Using Manifold Geometry. AAAI Fall Symposium: Manifold Learning and Its Applications, 2010.
[7] Ian H. Witten, et al. Interactive machine learning: letting users build classifiers. Int. J. Hum. Comput. Stud., 2002.