Supplementary Material for “Data-Driven 3D Voxel Patterns for Object Category Recognition”

To build the 3D voxel exemplars, we voxelize each 3D CAD model into a distribution of 3D voxels. Since 3D CAD models from web repositories, such as the Trimble 3D Warehouse [1], are usually irregular and not watertight, we employ the volumetric depth map fusion technique, which is widely used in dense 3D reconstruction [7], to build the voxel representation of a 3D CAD model. Fig. 1 illustrates our voxelization process. We first render depth images of a CAD model from different viewpoints (Fig. 1(a)). In our implementation, we render from 8 azimuths and 6 elevations, which produces 48 depth images. We then fuse these depth images to obtain a 3D point cloud on the surface of the object (Fig. 1(b)). Finally, we voxelize the 3D space and use the surface point cloud to determine which voxels are inside and which are outside the object (Fig. 1(c)). We experimented with different sizes of the 3D voxel space. The size controls a tradeoff between computational efficiency and representation power: a larger voxel space captures finer shape detail but costs more to compute. We found that a 50 × 50 × 50 voxel space works well in our experiments.
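The last step of this pipeline can be illustrated with a minimal Python sketch. It is not our implementation: the function names `viewpoints` and `voxelize` are hypothetical, the uniform angle spacing is an assumption (the text fixes only the counts, 8 azimuths and 6 elevations), and the inside/outside test shown here, a flood fill of the exterior starting from a grid corner, is one plausible way to classify voxels from a surface point cloud. Depth rendering and fusion are omitted; the sketch starts from the fused surface points.

```python
import math
from collections import deque
from itertools import product

def viewpoints(n_azimuth=8, n_elevation=6):
    """Return (azimuth, elevation) pairs in degrees for depth rendering."""
    # Assumption: uniform spacing; the text specifies only the counts.
    azimuths = [360.0 * i / n_azimuth for i in range(n_azimuth)]
    elevations = [-90.0 + 180.0 * (j + 0.5) / n_elevation
                  for j in range(n_elevation)]
    return list(product(azimuths, elevations))

def voxelize(points, size=50):
    """Split a size^3 grid into surface and interior voxels.

    points: iterable of (x, y, z) surface samples (e.g. fused depth maps).
    Returns two sets of (i, j, k) indices: surface voxels, interior voxels.
    """
    points = list(points)
    lo = [min(p[d] for p in points) for d in range(3)]
    hi = [max(p[d] for p in points) for d in range(3)]
    extent = max(h - l for l, h in zip(lo, hi)) or 1.0
    # Centre the bounding cube and keep a one-voxel empty margin so that
    # all exterior voxels form a single connected region.
    origin = [(l + h) / 2.0 - extent / 2.0 for l, h in zip(lo, hi)]
    inner = size - 2
    surface = set()
    for p in points:
        surface.add(tuple(
            min(inner - 1, int((p[d] - origin[d]) / extent * inner)) + 1
            for d in range(3)))
    # Assumed inside/outside test: flood-fill the exterior from a corner
    # with 6-connectivity; surface voxels act as walls.
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
             (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    outside = {(0, 0, 0)}
    queue = deque(outside)
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            n = (x + dx, y + dy, z + dz)
            if (all(0 <= c < size for c in n)
                    and n not in surface and n not in outside):
                outside.add(n)
                queue.append(n)
    # Whatever the flood fill cannot reach is enclosed by the surface.
    interior = set(product(range(size), repeat=3)) - surface - outside
    return surface, interior
```

The flood fill only gives a meaningful interior when the point cloud is dense enough that the surface voxels form a closed shell at the chosen resolution. With a sparse point cloud or a very fine grid the exterior leaks through gaps, which is one practical reason the grid size has to be chosen together with the sampling density.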

[1] David A. McAllester, et al. Object Detection with Discriminatively Trained Part Based Models, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Pietro Perona, et al. Fast Feature Pyramids for Object Detection, 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Delbert Dueck, et al. Clustering by Passing Messages Between Data Points, 2007, Science.

[4] Mohan M. Trivedi, et al. Fast and Robust Object Detection Using Visual Subcategories, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[5] Richard Szeliski, et al. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[6] Andreas Geiger, et al. Are we ready for autonomous driving? The KITTI vision benchmark suite, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[7] Silvio Savarese, et al. Object Detection by 3D Aspectlets and Occlusion Reasoning, 2013, 2013 IEEE International Conference on Computer Vision Workshops.