Computing Principal Components Dynamically

In this paper we present closed-form solutions for efficiently updating the principal components of a set of $n$ points when $m$ points are added to or deleted from the point set. For both operations performed on a discrete point set in $\mathbb{R}^d$, we can compute the new principal components in $O(m)$ time for fixed $d$. This is a significant improvement over the commonly used approach of recomputing the principal components from scratch, which takes $O(n+m)$ time. An important application of this result is the dynamic computation of bounding boxes based on principal component analysis (PCA). PCA bounding boxes are widely used in many fields, for example in computer graphics for collision detection and fast rendering. We have implemented and evaluated several algorithms for dynamically computing PCA bounding boxes in $\mathbb{R}^3$. In addition, we present closed-form solutions for dynamically computing the principal components of continuous point sets in $\mathbb{R}^2$ and $\mathbb{R}^3$. In both the discrete and the continuous case, computing the new principal components requires no additional data structures or storage.
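The $O(m)$ bound for the discrete case can be illustrated with a minimal sketch. It assumes the principal components are taken as the eigenvectors of the population covariance matrix (normalized by $1/n$), and it uses the standard identity for merging the mean and covariance of two point sets; this is not necessarily the exact closed form derived in the paper. The function names `mean_cov` and `update` are hypothetical. Only the count $n$, the mean, and the covariance matrix of the current set are kept, and only the $m$ batch points are touched.

```python
def mean_cov(points):
    """Population mean and covariance (1/n normalization) of a list of d-tuples."""
    n = len(points)
    d = len(points[0])
    mu = [sum(p[i] for p in points) / n for i in range(d)]
    cov = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / n
            for j in range(d)] for i in range(d)]
    return mu, cov


def update(n, mu, cov, batch, sign=+1):
    """Add (sign=+1) or delete (sign=-1) the m points in `batch`.

    Assumption: (n, mu, cov) describe the current point set, and for a
    deletion the batch points really belong to that set. Only the batch is
    scanned, so the cost is O(m) for fixed d.
    """
    m = len(batch)
    d = len(mu)
    n_new = n + sign * m
    if n_new <= 0:
        raise ValueError("the updated point set must be non-empty")
    mu_b, cov_b = mean_cov(batch)
    # Weighted combination of the two means.
    mu_new = [(n * mu[i] + sign * m * mu_b[i]) / n_new for i in range(d)]
    # Weighted combination of the covariances plus a rank-one correction
    # that accounts for the distance between the two means.
    diff = [mu[i] - mu_b[i] for i in range(d)]
    cov_new = [[(n * cov[i][j] + sign * m * cov_b[i][j]) / n_new
                + sign * n * m * diff[i] * diff[j] / n_new ** 2
                for j in range(d)] for i in range(d)]
    return n_new, mu_new, cov_new
```

The updated principal components are then the eigenvectors of the returned covariance matrix. Since that matrix is $d \times d$, the eigendecomposition costs a constant amount of time for fixed $d$ and does not change the $O(m)$ bound.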
