Sampling color and geometry point clouds from ShapeNet dataset

The popularisation of acquisition devices capable of capturing volumetric information, such as LiDAR scanners and depth cameras, has led to an increased interest in point clouds as an imaging modality. Due to the high amount of data needed for their representation, efficient compression solutions are needed to enable practical applications. Among the many techniques proposed in recent years, learning-based methods have received considerable attention due to their high performance and potential for improvement. Such algorithms depend on large and diverse training sets to achieve good compression performance. ShapeNet is a large-scale dataset composed of CAD models with texture and constitutes an effective option for training such compression methods. This dataset is entirely composed of meshes, which must go through a sampling process in order to obtain point clouds with geometry and texture information. Although many existing software libraries are able to sample geometry from meshes through simple functions, obtaining an output point cloud with the geometry and color of the external faces of the mesh models is not a straightforward process for the ShapeNet dataset. The main difficulty associated with this dataset is that its models are often defined with duplicated faces sharing the same vertices, but with different color values. This document describes a script for sampling the meshes from ShapeNet that circumvents this issue by excluding the internal faces of the mesh models prior to the sampling. The script can be accessed from the following link: https://github.com/mmspg/mesh-sampling.

1 Scope and Background

The development of imaging modalities for the representation of three-dimensional content has been an important topic of research in the last decades.
The increasing performance of computing devices, together with the high quality of modern displays, has allowed for a fast development of the field of computer graphics for both industrial and entertainment applications, to mention two among a large number of potential uses. Traditionally, this field has relied on meshes as the imaging modality for the representation of artificially generated content. Mesh models are represented as a set of interconnected points in three-dimensional space. These vertices and edges define a set of polygons that usually constitute the surface of a watertight volume. The color on the faces of such 3D models can be defined either as values assigned individually to each face or as a two-dimensional texture mapped directly onto the surface. Point clouds, on the other hand, do not contain any connectivity information, being composed solely of a list of point coordinates with associated attributes such as color, normal vectors, semantic labels and many other possible features. The advent and popularization of acquisition devices capable of capturing volumetric information, such as LiDAR scanners and depth cameras, has fostered the rise of new applications such as telepresence, virtual reality and wide-area scanning. The output of such devices can usually be easily converted into a list of the space coordinates of the acquired points with associated attributes such as color and reflectance. Although there are algorithms capable of generating meshes from the scans, in many applications it is more advantageous to directly use the acquired points in the form of point clouds.
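The difference between the two modalities can be made concrete with a minimal sketch: a colored point cloud is fully described by two parallel arrays, with no connectivity whatsoever. The variable names here are illustrative and do not come from any particular library or file format.

```python
import numpy as np

# A point cloud is just a list of coordinates with per-point attributes:
# no faces, no edges, no connectivity of any kind.
coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])              # XYZ positions, shape (N, 3)
colors = np.array([[255, 0, 0],
                   [0, 255, 0],
                   [0, 0, 255]], dtype=np.uint8)  # RGB attributes, shape (N, 3)

assert coords.shape == colors.shape               # one attribute vector per point
```

Formats such as PLY store essentially these two arrays (plus a header and, possibly, further attributes such as normals or reflectance).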
Table 1: Datasets employed for training learning-based point cloud compression methods.

Dataset          | Compression method
ShapeNet [1]     | Wang et al. [2, 3]
ModelNet [4]     | Quach et al. [5, 6] and Nguyen et al. [7]
MPEG             | Alexiou et al. [8] and Guarda et al. [9, 10, 11, 12, 13, 14, 15]
JPEG Pleno [16]  | Alexiou et al. [8]
nuScenes [17]    | Wiesmann et al. [18]

Depending on the application, the number of points in a typical point cloud model can range from thousands up to the order of billions. Since the transmission and storage of such huge amounts of data is impractical, efficient compression methods are paramount. For this reason, standardisation committees such as JPEG, Khronos Group and MPEG have been devoting efforts to the development of interoperable compression standards.

2 Current practices and challenges

While many conventional data structures, such as octrees or sets of projections, have been proposed to encode point cloud data, deep learning-based architectures have been reporting high performance and have attracted the attention of many researchers and standardisation groups. Such methods apply transforms learned through a training process, relying on large and diverse datasets with thousands of point clouds. Several datasets have been employed to train the learning-based compression algorithms reported in the literature. Table 1 lists some of these datasets, including references to the respective compression methods. Among the datasets listed in Table 1, ShapeNet is a powerful option for training learning-based compression methods due to its large number of models with associated color texture. Moreover, it has already been successfully employed for training geometry-only compression algorithms. Since ShapeNet is composed of mesh models, a preprocessing step is needed in order to convert the dataset into point clouds prior to its use in the training loop.
Although ignoring the connectivity information and forming a point cloud with the mesh vertices is in theory a possible solution, the resulting models would potentially have too low a point density. Previous works [5, 6, 3] used random sampling followed by voxelization in order to obtain geometry-only point clouds with points lying on a uniform grid. However, the software libraries used by these authors are only capable of sampling the geometry, ignoring associated color attributes.

3 Mesh sampling solutions

3.1 Software libraries

Several open source learning-based point cloud compression methods [5, 6, 2, 3, 8] use Python [19] as their programming language. Similarly, many software libraries for point cloud processing are based on Python as well, such as Pyntcloud [20], Open3D [21] and pymeshlab [22]. Pyntcloud [20] allows for the sampling of meshes through the method get_sample(), which randomly selects a defined number of points from a mesh. This library was employed by the authors of [5, 6, 3], but it is only able to generate geometry-only point clouds. The Open3D [21] Python library has two methods for mesh sampling: sample_points_uniformly() applies uniform sampling, while sample_points_poisson_disk() uses Poisson disk sampling [23] to obtain a point cloud from the mesh. Likewise, these methods are only able to deal with geometry-only data. Meshlab [22] is a standalone application that allows for the visualisation and processing of meshes and point clouds. It contains a large number of different algorithms for mesh sampling, which are, however, also unable to generate point clouds with texture. Meshlab also has functions that transfer the color from mesh vertices to a point cloud, but these are not able to deal with cases where the color is defined as a two-dimensional texture map. All functions from Meshlab are also available in a corresponding Python library called pymeshlab.
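For illustration, the geometry-only pipeline mentioned above (random surface sampling followed by voxelization) can be sketched in plain numpy. This is an independent reimplementation of the general technique, not the code of the cited libraries; the function names are invented here.

```python
import numpy as np

def sample_mesh_geometry(vertices, faces, n_points, rng=None):
    """Area-weighted random sampling of a triangle mesh surface
    (the same idea behind pyntcloud's get_sample() and Open3D's
    sample_points_uniformly()); geometry only, no color."""
    rng = np.random.default_rng(rng)
    tri = vertices[faces]                              # (F, 3, 3) triangle corners
    # Triangle areas from the cross product of two edge vectors.
    cross = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    areas = 0.5 * np.linalg.norm(cross, axis=1)
    # Draw triangles with probability proportional to their area.
    idx = rng.choice(len(faces), size=n_points, p=areas / areas.sum())
    # Uniform barycentric coordinates (reflected to stay inside the triangle).
    u, v = rng.random(n_points), rng.random(n_points)
    flip = u + v > 1.0
    u[flip], v[flip] = 1.0 - u[flip], 1.0 - v[flip]
    t = tri[idx]
    return t[:, 0] + u[:, None] * (t[:, 1] - t[:, 0]) + v[:, None] * (t[:, 2] - t[:, 0])

def voxelize(points, bits=10):
    """Quantize points onto a 2^bits uniform grid and drop duplicates."""
    lo, hi = points.min(0), points.max(0)
    q = np.floor((points - lo) / (hi - lo).max() * (2 ** bits - 1)).astype(np.int64)
    return np.unique(q, axis=0)

# Example: sample a unit square made of two triangles.
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
faces = np.array([[0, 1, 2], [0, 2, 3]])
pts = sample_mesh_geometry(verts, faces, 5000, rng=0)
vox = voxelize(pts, bits=6)
```

Crucially for ShapeNet, the sketch above (like the library functions it imitates) carries no color information over to the sampled points.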
CloudCompare [24] is another tool that can be used for the visualisation and processing of 3D content. This software contains a function that allows for the direct sampling of both the color and the geometry at random positions over a mesh surface. Moreover, it is able to deal with color defined either per face or as a texture map.
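As a final illustration of the duplicated-face issue described above, the following numpy sketch flags faces that reuse a vertex triple already seen. This is only a simplified stand-in: deciding which copy of a coincident pair is the external one requires additional information (e.g. face orientation), and the criterion used by the actual sampling script may differ.

```python
import numpy as np

def drop_duplicated_faces(faces):
    """Keep only the first occurrence of each vertex triple, ignoring
    winding order. In ShapeNet models, duplicated faces sharing the same
    vertices typically correspond to an external and an internal copy of
    the same polygon carrying different colors."""
    key = np.sort(faces, axis=1)                   # order-insensitive vertex triple
    _, first = np.unique(key, axis=0, return_index=True)
    keep = np.zeros(len(faces), dtype=bool)
    keep[first] = True                             # first occurrence of each triple
    return faces[keep], keep

# Two coincident faces (opposite winding) plus one unique face.
faces = np.array([[0, 1, 2],
                  [2, 1, 0],   # duplicate of the face above
                  [3, 4, 5]])
clean, keep = drop_duplicated_faces(faces)
```

Since the two copies of a coincident face carry different color values, the choice of which one to exclude directly determines the color of the points sampled from that surface patch.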

References

[1] Paolo Cignoni et al., "MeshLab: an Open-Source Mesh Processing Tool," Eurographics Italian Chapter Conference, 2008.

[2] Nuno M. M. Rodrigues et al., "Deep Learning-based Point Cloud Geometry Coding with Resolution Scalability," 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), 2020.

[3] Vladlen Koltun et al., "Open3D: A Modern Library for 3D Data Processing," arXiv, 2018.

[4] Nuno M. M. Rodrigues et al., "Deep Learning-Based Point Cloud Coding: A Behavior and Performance Study," 2019 8th European Workshop on Visual Information Processing (EUVIP), 2019.

[5] Touradj Ebrahimi et al., "Towards neural network approaches for point cloud compression," Optical Engineering + Applications, 2020.

[6] Eugene Fiume et al., "Hierarchical Poisson disk sampling distributions," 1992.

[7] Nuno M. M. Rodrigues et al., "Point Cloud Coding: Adopting a Deep Learning-based Approach," 2019 Picture Coding Symposium (PCS), 2019.

[8] Fernando Pereira et al., "Neighborhood Adaptive Loss Function for Deep Learning-Based Point Cloud Coding With Implicit and Explicit Quantization," IEEE MultiMedia, 2021.

[9] Fernando Pereira et al., "Adaptive Deep Learning-Based Point Cloud Geometry Coding," IEEE Journal of Selected Topics in Signal Processing, 2021.

[10] Giuseppe Valenzise et al., "Learning-Based Lossless Compression of 3D Point Cloud Geometry," ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.

[11] Leonidas J. Guibas et al., "ShapeNet: An Information-Rich 3D Model Repository," arXiv, 2015.

[12] Fernando Pereira et al., "Point Cloud Geometry Scalable Coding With a Single End-to-End Deep Learning Model," 2020 IEEE International Conference on Image Processing (ICIP), 2020.

[13] Cyrill Stachniss et al., "Deep Compression for Dense Point Cloud Maps," IEEE Robotics and Automation Letters, 2021.

[14] Zhan Ma et al., "Multiscale Point Cloud Geometry Compression," 2021 Data Compression Conference (DCC), 2021.

[15] Giuseppe Valenzise et al., "Learning Convolutional Transforms for Lossy Point Cloud Geometry Compression," 2019 IEEE International Conference on Image Processing (ICIP), 2019.

[16] Frederic Dufaux et al., "Improved Deep Point Cloud Geometry Compression," 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), 2020.

[17] Jianxiong Xiao et al., "3D ShapeNets: A deep representation for volumetric shapes," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

[18] Zhan Ma et al., "Lossy Point Cloud Geometry Compression via End-to-End Learning," IEEE Transactions on Circuits and Systems for Video Technology, 2021.

[19] Fred L. Drake et al., "Python 3 Reference Manual," 2009.

[20] Qiang Xu et al., "nuScenes: A Multimodal Dataset for Autonomous Driving," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

[21] Nuno M. M. Rodrigues et al., "Deep Learning-Based Point Cloud Geometry Coding: RD Control Through Implicit and Explicit Quantization," 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2020.