iSPEED: an Efficient In-Memory Based Spatial Query System for Large-Scale 3D Data with Complex Structures

Recent advances in digital pathology make it possible to support 3D tissue-based investigation of human diseases at extremely high resolutions. Exploring spatial relationships and patterns among massive numbers of 3D micro-anatomic biological objects, such as blood vessels and cells derived from 3D pathology image volumes, plays a critical role in studying human diseases. In this paper, we present iSPEED, an effective and scalable in-memory spatial query system for large-scale 3D data with complex structures. To achieve low latency, iSPEED stores data in memory, applying progressive compression to each 3D object so that it is available at successive levels of detail. To minimize search space and computation cost, iSPEED pre-generates global spatial indexes in memory and employs on-demand indexing at run time. In particular, iSPEED exploits structural indexing for objects with complex structures in distance-based queries. iSPEED provides a 3D spatial query engine that can be invoked on demand to run many instances in parallel; the engine is implemented with, but not limited to, MapReduce. Because iSPEED builds in-memory indexes and decompresses data only on demand, its memory footprint stays minimal. We evaluate iSPEED with two representative queries: 3D spatial joins and 3D spatial proximity estimation. Our experiments demonstrate that iSPEED significantly improves performance over traditional non-memory-based spatial query systems.
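The filtering idea behind an index-assisted 3D spatial join can be illustrated with a minimal sketch. This is not iSPEED's implementation: it assumes each object is reduced to an axis-aligned bounding box, and it swaps in a simple uniform grid built at query time in place of the system's actual index structures. All names (`build_grid_index`, `candidate_pairs`, the `cell` size parameter) are hypothetical. The refinement step on decompressed geometry is omitted.

```python
from collections import defaultdict
from itertools import product

# A box is (xmin, ymin, zmin, xmax, ymax, zmax); this layout is an assumption.

def boxes_intersect(a, b):
    """True if two axis-aligned 3D boxes overlap on every axis."""
    return all(a[i] <= b[i + 3] and b[i] <= a[i + 3] for i in range(3))

def cells_for(box, cell):
    """Yield the integer grid cells that a box touches."""
    ranges = [
        range(int(box[i] // cell), int(box[i + 3] // cell) + 1)
        for i in range(3)
    ]
    return product(*ranges)

def build_grid_index(boxes, cell):
    """Build an in-memory grid index mapping cell -> object ids."""
    grid = defaultdict(list)
    for obj_id, box in boxes.items():
        for key in cells_for(box, cell):
            grid[key].append(obj_id)
    return grid

def candidate_pairs(set_a, set_b, cell=10.0):
    """Filter step of a spatial join: pairs whose bounding boxes intersect."""
    index_b = build_grid_index(set_b, cell)  # index built only when queried
    pairs = set()  # a set removes duplicates from boxes spanning several cells
    for a_id, a_box in set_a.items():
        for key in cells_for(a_box, cell):
            for b_id in index_b.get(key, ()):
                if boxes_intersect(a_box, set_b[b_id]):
                    pairs.add((a_id, b_id))
    return sorted(pairs)

# Hypothetical data: two cells and one elongated vessel segment.
cells = {"c1": (0, 0, 0, 2, 2, 2), "c2": (50, 50, 50, 52, 52, 52)}
vessels = {"v1": (1, 1, 1, 30, 3, 3)}
result = candidate_pairs(cells, vessels)
```

Only the pairs that survive this filter would need their full geometry, which is where decompressing an object to a finer level of detail on demand would come in.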
