One of the central problems in pattern recognition is estimating the probability density function (pdf) of the input data, i.e., constructing a model of a probability distribution from a finite sample of data drawn from that distribution. Probabilistic Principal Surfaces (hereinafter PPS) is a nonlinear latent variable model that provides a way to perform pdf estimation, and it has two attractive properties useful for a wide range of data mining applications: (1) visualization of high-dimensional data and (2) classification of those data. PPS generates a nonlinear manifold passing through the data points, defined in terms of a number of latent variables and a nonlinear mapping from latent space to data space. Depending on the dimensionality of the latent space (usually at most 3-dimensional), one obtains 1-D, 2-D, or 3-D manifolds. Among the 3-D manifolds, PPS makes it possible to build a spherical manifold in which the latent variables are uniformly arranged on a unit sphere. This particular form of the manifold is a very effective tool for reducing the problems that arise from the curse of dimensionality as the data dimension increases. In this paper we concentrate on PPS as a visualization tool, proposing a number of plot options and showing its effectiveness on two complex astronomical data sets.
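To make the spherical construction concrete, the following is a minimal sketch, not the implementation used in this work. It places latent nodes approximately uniformly on a unit sphere and pushes them through a nonlinear mapping of the form y(x) = W φ(x) with Gaussian basis functions. Several choices here are assumptions made purely for illustration: the Fibonacci lattice used to spread the nodes, the function names `sphere_nodes`, `rbf_features`, and `map_to_data_space`, the basis width, and the random weight matrix, which stands in for parameters that a trained model would learn from the data.

```python
import math
import random


def sphere_nodes(m):
    """Return m approximately uniform points on the unit sphere.

    Assumption: a Fibonacci lattice is used for the placement; the
    source only states that the nodes are uniformly arranged.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    nodes = []
    for i in range(m):
        z = 1.0 - (2.0 * i + 1.0) / m   # heights evenly spaced in (-1, 1)
        r = math.sqrt(1.0 - z * z)      # radius of the circle at height z
        theta = golden_angle * i        # rotate by the golden angle each step
        nodes.append((r * math.cos(theta), r * math.sin(theta), z))
    return nodes


def rbf_features(x, centers, width):
    """Gaussian basis activations phi_l(x) for one latent point x."""
    feats = []
    for c in centers:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, c))
        feats.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    return feats


def map_to_data_space(x, centers, width, weights):
    """Nonlinear mapping y(x) = W phi(x); weights has one row per data dimension."""
    phi = rbf_features(x, centers, width)
    return [sum(w * p for w, p in zip(row, phi)) for row in weights]


# Hypothetical sizes: 50 latent nodes, 6 basis centers, 10-dimensional data.
latent_nodes = sphere_nodes(50)
centers = sphere_nodes(6)
rng = random.Random(0)
# Placeholder weights; a fitted model would estimate these from the data.
weights = [[rng.gauss(0.0, 1.0) for _ in centers] for _ in range(10)]
projected = [map_to_data_space(x, centers, 1.0, weights) for x in latent_nodes]
```

Each entry of `projected` is the image in data space of one latent node, so together they trace out a closed surface in the 10-dimensional space. Visualization works in the opposite direction: data points are associated with latent nodes and plotted on the sphere.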