Paper information - Learning Filaments

Learning Filaments

This paper is about new statistics and new efficient algorithms for a form of mixture model that learns filamentary structures. Such models are important in several areas of scientific data analysis, but in this paper our main example is identification of large-scale structure among galaxies. We describe software which can extract the positions of spherical and line-shaped clusters from data about the locations of objects such as galaxies. We do so by fitting a particular type of Gaussian mixture model to the galaxy locations. The most interesting feature of our model is that it directly represents line segments in the distribution, unlike standard Gaussian mixture models which can only handle ellipses. Because we fit the line segments directly, we do not need to do any post-processing to extract their locations. We use a modification of the k-means algorithm to find model parameters. Since our software needs to deal with large data sets, it is important to accelerate model-fitting as much as possible. So, we store the galaxy locations in a multi-resolution kd-tree, and we introduce new pruning algorithms that allow us to skip over large parts of the tree in each k-means step. We provide evaluations on both synthetic and real data sets.

Geoffrey J. Gordon | Andrew Moore
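
The abstract describes a k-means variant whose cluster centers can be line segments. The sketch below illustrates that general idea in 2D only; it is not the authors' algorithm. It assumes hard assignments by point-to-segment distance and refits each segment along the principal axis of its assigned points, and it omits the Gaussian mixture likelihood, the spherical clusters, and the kd-tree pruning. The names `point_segment_dist2`, `fit_segment`, and `segment_kmeans` are hypothetical.

```python
import math


def point_segment_dist2(p, a, b):
    """Squared distance from 2D point p to the segment with endpoints a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    len2 = dx * dx + dy * dy
    if len2 == 0.0:
        t = 0.0  # degenerate segment: treat it as a single point
    else:
        # Project p onto the line through a and b, then clamp to the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / len2
        t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2


def fit_segment(points):
    """Refit a segment: principal axis through the mean, spanning the projections.

    Assumption for illustration only; using the min/max projection as
    endpoints is sensitive to outliers.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Closed-form direction of the major axis of a 2D scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    proj = [(x - mx) * ux + (y - my) * uy for x, y in points]
    lo, hi = min(proj), max(proj)
    return (mx + lo * ux, my + lo * uy), (mx + hi * ux, my + hi * uy)


def segment_kmeans(points, segments, n_iter=20):
    """Alternate hard assignment and segment refitting, as in plain k-means."""
    for _ in range(n_iter):
        groups = [[] for _ in segments]
        for p in points:
            nearest = min(
                range(len(segments)),
                key=lambda k: point_segment_dist2(p, *segments[k]),
            )
            groups[nearest].append(p)
        # Keep the old segment if no point was assigned to it.
        new_segments = [
            fit_segment(g) if g else s for g, s in zip(groups, segments)
        ]
        if new_segments == segments:
            break  # assignments and fits no longer change
        segments = new_segments
    return segments
```

As written, every point is compared against every segment in each iteration. The abstract's kd-tree pruning targets exactly this cost, but its details are not given here, so it is not sketched.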
