Fast agglomerative clustering for rendering

Hierarchical representations of large data sets, such as binary cluster trees, are a crucial component of many scalable algorithms across a range of fields. The two major approaches to building these trees are agglomerative (bottom-up) clustering and divisive (top-down) clustering. The agglomerative approach allows more flexible clustering and often produces higher-quality trees, but it has been little used in graphics because it is frequently assumed to be prohibitively expensive (O(N^2) or worse). In this paper we show that agglomerative clustering can be done efficiently even for very large data sets. We introduce a novel locally-ordered algorithm that is faster than traditional heap-based agglomerative clustering, and we show that the tree build time grows much closer to linearly than quadratically. We also compare the quality of agglomerative clustering trees against the best known divisive clustering strategies in two sample applications: bounding volume hierarchies for ray tracing and light trees in the Lightcuts rendering algorithm. Tree quality depends strongly on the application, the data set, and the dissimilarity function. In our experiments the agglomerative-built trees are consistently of higher quality, by margins ranging from slight to significant, with up to a 35% reduction in tree query times.
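To make the bottom-up idea concrete, here is a minimal Python sketch of one way a locally-ordered build could work: follow best-match links from cluster to cluster and merge as soon as two clusters are each other's best match, with no global heap of candidate pairs. The abstract does not spell out the algorithm, so everything here is an illustrative assumption: the names `Node`, `merge`, `dissimilarity`, `nearest`, and `build_tree` are hypothetical, the dissimilarity is taken to be the surface area of the merged axis-aligned bounding box, and the best-match search is brute force. With brute-force search the build is quadratic or worse, so a spatial search structure would be needed to approach the near-linear behavior reported above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


@dataclass(eq=False)  # eq=False: nodes compare by identity, not by value
class Node:
    lo: Point                       # min corner of the bounding box
    hi: Point                       # max corner of the bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def merge(a: Node, b: Node) -> Node:
    """Return a parent node whose box encloses both children."""
    lo = tuple(min(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(max(x, y) for x, y in zip(a.hi, b.hi))
    return Node(lo, hi, a, b)


def dissimilarity(a: Node, b: Node) -> float:
    """Assumed metric: surface area of the box enclosing a and b."""
    m = merge(a, b)
    dx, dy, dz = (h - l for l, h in zip(m.lo, m.hi))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def nearest(a: Node, active: List[Node]) -> Node:
    """Brute-force best match for a (a real build would use a spatial index)."""
    return min((c for c in active if c is not a),
               key=lambda c: dissimilarity(a, c))


def build_tree(points: List[Point]) -> Node:
    if not points:
        raise ValueError("need at least one point")
    active = [Node(p, p) for p in points]
    a = active[0]
    while len(active) > 1:
        b = nearest(a, active)
        c = nearest(b, active)
        # Merge when b has no strictly better partner than a; otherwise
        # step to b, which strictly lowers the current best-match cost,
        # so the walk always terminates.
        if dissimilarity(b, c) >= dissimilarity(a, b):
            active.remove(a)
            active.remove(b)
            a = merge(a, b)
            active.append(a)
        else:
            a = b
    return active[0]
```

For example, `build_tree([(0, 0, 0), (1, 1, 1), (10, 10, 10), (11, 11, 11)])` pairs the two nearby points at each end before joining the two resulting clusters at the root. Merging mutual best matches yields the same tree as always merging the globally best pair only when merging never makes a cluster more similar to a third one, which holds for the enclosing-box metric assumed here.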
