QuickNN: Memory and Performance Optimization of k-d Tree Based Nearest Neighbor Search for 3D Point Clouds

Light Detection And Ranging (LiDAR) has enabled continued improvement in the accuracy and performance of autonomous navigation. The latest applications require LiDARs of the highest spatial resolution, which generate massive 3D point clouds that must be processed in real time. In this work, we investigate the architecture design for k-Nearest Neighbor (kNN) search, an important processing kernel for 3D point clouds. An approximate kNN search based on a k-dimensional (k-d) tree is employed to improve performance. However, even for today's moderate-sized problems, this approximate kNN search is severely limited by memory bandwidth because of numerous random accesses and minimal opportunities for data reuse. We apply several memory optimization schemes to alleviate the bandwidth bottleneck: 1) the k-d tree data structure is partitioned into two sets, tree nodes and point buckets, based on their distinct characteristics: tree nodes, which have high reuse, are cached for their lifetime to facilitate search, while point buckets, which have low reuse, are organized in regular contiguous segments in external memory to facilitate efficient burst access; 2) write and read caches are added to gather random accesses and transform them into sequential accesses; and 3) tree construction and tree search are interleaved to cut redundant access streams. With memory bandwidth optimized, the kNN search can be further accelerated by two new processing schemes: 1) parallel tree traversal that utilizes multiple workers with minimal tree duplication overhead, and 2) incremental tree building that minimizes the overhead of tree construction by dynamically updating the tree instead of building it from scratch every time.
We demonstrate the performance- and memory-optimized QuickNN architecture on an FPGA and perform exhaustive benchmarking, showing up to 19× and 7.3× speedups over k-d tree searches performed on a modern CPU and GPU, respectively, and a 14.5× speedup over a comparably sized architecture performing an exact search. Finally, we show that QuickNN achieves a two-orders-of-magnitude improvement in performance per watt over CPU and GPU methods.
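
As an illustration of the tree-node and point-bucket split described above, the following is a minimal software sketch of a bucketed k-d tree with an approximate search that scans only the bucket the query falls into. It is not the QuickNN hardware design. The bucket capacity, the median split rule, the single-bucket search policy, and all function names are assumptions made for this example.

```python
BUCKET_SIZE = 8  # assumed leaf capacity; not a value taken from the paper


def build(points, depth=0):
    """Build a k-d tree over 3D points, splitting at the median and cycling x, y, z."""
    if len(points) <= BUCKET_SIZE:
        # Leaf: a point bucket, stored as one contiguous list.
        return {"bucket": list(points)}
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    # Internal tree node: small, and visited by every query that passes through it.
    return {
        "axis": axis,
        "split": pts[mid][axis],
        "left": build(pts[:mid], depth + 1),
        "right": build(pts[mid:], depth + 1),
    }


def find_bucket(node, query):
    """Descend through internal nodes only, without backtracking."""
    while "bucket" not in node:
        side = "left" if query[node["axis"]] < node["split"] else "right"
        node = node[side]
    return node["bucket"]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def approx_knn(root, query, k):
    """Approximate kNN: rank only the points in the query's own bucket.

    Returns fewer than k points if the bucket holds fewer than k.
    """
    bucket = find_bucket(root, query)
    return sorted(bucket, key=lambda p: sq_dist(p, query))[:k]


def exact_knn(points, query, k):
    """Brute-force reference search over all points."""
    return sorted(points, key=lambda p: sq_dist(p, query))[:k]
```

In this sketch the internal nodes are the only data touched during traversal, and each query reads exactly one bucket. That mirrors the access pattern that motivates caching the tree nodes and laying the buckets out contiguously. The result is approximate because a true neighbor may lie in an adjacent bucket that is never visited.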

[1]  Amrita Mazumdar,et al.  POSTER: Application-Driven Near-Data Processing for Similarity Search , 2017, 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT).

[2]  Joachim Hertzberg,et al.  6D SLAM—3D mapping outdoor environments , 2007, J. Field Robotics.

[3]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[4]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  Timothy J. Purcell Sorting and searching , 2005, SIGGRAPH Courses.

[6]  Gérard G. Medioni,et al.  Object modelling by registration of multiple range images , 1992, Image Vis. Comput..

[7]  Hideharu Amano,et al.  An FPGA Acceleration for the Kd-tree Search in Photon Mapping , 2013, ARC.

[8]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[9]  Michael A. Greenspan,et al.  Approximate k-d tree search for efficient ICP , 2003, Fourth International Conference on 3-D Digital Imaging and Modeling, 2003. 3DIM 2003. Proceedings..

[10]  Zhengyou Zhang,et al.  Iterative point matching for registration of free-form curves and surfaces , 1994, International Journal of Computer Vision.

[11]  Tong Wang,et al.  Fully parallel kd-tree construction for real-time ray tracing , 2014, I3D '14.

[12]  Markus H. Gross,et al.  A hardware processing unit for point sets , 2008, GH '08.

[13]  Feifei Li,et al.  Fixed-function hardware sorting accelerators for near data MapReduce execution , 2015, 2015 33rd IEEE International Conference on Computer Design (ICCD).

[14]  Paulo Peixoto,et al.  3D Lidar-based static and moving obstacle detection in driving environments: An approach based on voxels and multi-region ground planes , 2016, Robotics Auton. Syst..

[15]  Michael A. Greenspan,et al.  A high speed iterative closest point tracker on an FPGA platform , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[16]  Lingjia Tang,et al.  The Architectural Implications of Autonomous Driving: Constraints and Acceleration , 2018, ASPLOS.

[17]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[18]  Sebastian Thrun,et al.  Robust vehicle localization in urban environments using probabilistic maps , 2010, 2010 IEEE International Conference on Robotics and Automation.

[19]  Tulga Ersal,et al.  A Multi-Stage Optimization Formulation for MPC-Based Obstacle Avoidance in Autonomous Vehicles Using a LIDAR Sensor , 2014 .

[20]  Shuqing Zeng An Object-Tracking Algorithm for 3-D Range Data Using Motion and Surface Estimation , 2013, IEEE Transactions on Intelligent Transportation Systems.

[21]  Leif Kobbelt,et al.  A survey of point-based techniques in computer graphics , 2004, Comput. Graph..

[22]  Diego Alonso,et al.  A Machine Learning Approach to Pedestrian Detection for Autonomous Vehicles Using High-Definition 3D Range Data , 2016, Sensors.

[23]  Ryan Halterman,et al.  Velodyne HDL-64E lidar for unmanned surface vehicle obstacle detection , 2010, Defense + Commercial Sensing.

[24]  Zhengyou Zhang,et al.  Iterative Closest Point (ICP) , 2014, Computer Vision, A Reference Guide.

[25]  Forest Baskett,et al.  An Algorithm for Finding Nearest Neighbors , 1975, IEEE Transactions on Computers.

[26]  Yangdong Deng,et al.  FastTree: A hardware KD-tree construction acceleration engine for real-time ray tracing , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[27]  D.M. Mount,et al.  An Efficient k-Means Clustering Algorithm: Analysis and Implementation , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[28]  Se-Young Oh,et al.  Fast Iterative Closest Point framework for 3D LIDAR data in intelligent vehicle , 2012, 2012 IEEE Intelligent Vehicles Symposium.

[29]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[30]  Nikolaos Papanikolopoulos,et al.  Fast segmentation of 3D point clouds: A paradigm on LiDAR data for autonomous vehicle applications , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[31]  Lee Burchett,et al.  Parallelized Iterative Closest Point for Autonomous Aerial Refueling , 2016, ISVC.

[32]  Zhe Wang,et al.  Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search , 2007, VLDB.

[33]  Brent Schwarz,et al.  LIDAR: Mapping the world in 3D , 2010 .

[34]  Ryan M. Eustice,et al.  Ford Campus vision and lidar data set , 2011, Int. J. Robotics Res..

[35]  Donald E. Knuth,et al.  The art of computer programming, volume 3: (2nd ed.) sorting and searching , 1998 .