GPU and ROS: The Use of General Parallel Processing Architecture for Robot Perception

This chapter presents a full tutorial on getting started with parallel processing in ROS. It begins with a guide to installing the complete version of ROS on the Nvidia development boards Tegra K1, Tegra X1 and Tegra X2. This guide also covers updating the boards to the latest OS and configuring CUDA, ROS and OpenCV4Tegra so that they are ready to run the sample packages included in the chapter. The chapter then describes how to install CUDA on a computer running Ubuntu. Finally, it covers the integration between ROS and CUDA, with many examples of how to create packages and perform parallel processing on several of the most widely used ROS message types. The code and examples presented in this chapter are available on GitHub in the repository at https://github.com/air-lasca/ros-cuda.
