This paper is about new statistics and new efficient algorithms for a form of mixture model that learns filamentary structures. Such models are important in several areas of scientific data analysis, but in this paper our main example is identification of large-scale structure among galaxies. We describe software which can extract the positions of spherical and line-shaped clusters from data about the locations of objects such as galaxies. We do so by fitting a particular type of Gaussian mixture model to the galaxy locations. The most interesting feature of our model is that it directly represents line segments in the distribution, unlike standard Gaussian mixture models, which can only handle ellipses. Because we fit the line segments directly, we do not need to do any post-processing to extract their locations. We use a modification of the k-means algorithm to find model parameters. Since our software needs to deal with large data sets, it is important to accelerate model-fitting as much as possible. So, we store the galaxy locations in a multi-resolution kd-tree, and we introduce new pruning algorithms that allow us to skip over large parts of the tree in each k-means step. We provide evaluations on both synthetic and real data sets.
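To make the idea of a k-means step with line-shaped clusters concrete, the following is a minimal sketch of the assignment step only, assuming that each cluster is represented by a pair of segment endpoints and that a point is assigned to the segment it is closest to in Euclidean distance. The function names `point_to_segment_distance` and `assign_points` are hypothetical, and the sketch covers neither the Gaussian mixture likelihood, the parameter update, nor the kd-tree pruning described above; it is an illustration, not the method as implemented in the software.

```python
import math


def point_to_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment with endpoints a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    ab_len_sq = sum(c * c for c in ab)
    if ab_len_sq == 0:
        # Degenerate segment (a == b): behaves like a single cluster center.
        t = 0.0
    else:
        # Project p onto the line through a and b, then clamp to the segment.
        t = sum(x * y for x, y in zip(ap, ab)) / ab_len_sq
        t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest)


def assign_points(points, segments):
    """Hard assignment (assumed here): index of the nearest segment per point."""
    return [
        min(
            range(len(segments)),
            key=lambda j: point_to_segment_distance(p, *segments[j]),
        )
        for p in points
    ]
```

A segment whose two endpoints coincide reduces to an ordinary k-means center, so in this sketch spherical and line-shaped clusters can share one representation. This loop visits every point for every segment; the kd-tree pruning described above is aimed at avoiding exactly that exhaustive pass.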