SharkDB: An In-Memory Column-Oriented Trajectory Storage

The last decade has witnessed the prevalence of sensor and GPS technologies that produce a high volume of trajectory data representing the motion history of moving objects. However some characteristics of trajectories such as variable lengths and asynchronous sampling rates make it difficult to fit into traditional database systems that are disk-based and tuple-oriented. Motivated by the success of column store and recent development of in-memory databases, we try to explore the potential opportunities of boosting the performance of trajectory data processing by designing a novel trajectory storage within main memory. In contrast to most existing trajectory indexing methods that keep consecutive samples of the same trajectory in the same disk page, we partition the database into frames in which the positions of all moving objects at the same time instant are stored together and aligned in main memory. We found this column-wise storage to be surprisingly well suited for in-memory computing since most frames can be stored in highly compressed form, which is pivotal for increasing the memory throughput and reducing CPU-cache miss. The independence between frames also makes them natural working units when parallelizing data processing on a multi-core environment. Lastly we run a variety of common trajectory queries on both real and synthetic datasets in order to demonstrate advantages and study the limitations of our proposed storage.

[1]  Jimeng Sun,et al.  The TPR*-Tree: An Optimized Spatio-Temporal Access Method for Predictive Queries , 2003, VLDB.

[2]  Christian S. Jensen,et al.  Indexing the positions of continuously moving objects , 2000, SIGMOD '00.

[3]  KimChangkyu,et al.  Fast updates on read-optimized databases using multi-core CPUs , 2011, VLDB 2011.

[4]  Kenneth A. Ross,et al.  Making B+- trees cache conscious in main memory , 2000, SIGMOD '00.

[5]  Nirvana Meratnia,et al.  Spatiotemporal Compression Techniques for Moving Point Objects , 2004, EDBT.

[6]  Marcin Zukowski,et al.  Positional update handling in column stores , 2010, SIGMOD Conference.

[7]  Martin L. Kersten,et al.  An architecture for recycling intermediates in a column-store , 2009, SIGMOD Conference.

[8]  Jörg Sander,et al.  A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing , 2005, VLDB.

[9]  Dieter Pfoser,et al.  Novel Approaches in Query Processing for Moving Object Trajectories , 2000, VLDB 2000.

[10]  Pradeep Dubey,et al.  Fast Updates on Read-Optimized Databases Using Multi-Core CPUs , 2011, Proc. VLDB Endow..

[11]  Hasso Plattner,et al.  SanssouciDB: An In-Memory Database for Processing Enterprise Workloads , 2011, BTW.

[12]  Ravi Krishnamurthy,et al.  Design of a Memory Resident DBMS , 1985, IEEE Computer Society International Conference.

[13]  Carsten Binnig,et al.  Dictionary-based order-preserving string compression for main memory column stores , 2009, SIGMOD Conference.

[14]  Dieter Pfoser,et al.  Novel Approaches to the Indexing of Moving Object Trajectories , 2000, VLDB.

[15]  Alexander Zeier,et al.  Speeding Up Queries in Column Stores - A Case for Compression , 2010, DaWak.

[16]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[17]  Jignesh M. Patel,et al.  Indexing Large Trajectory Data Sets With SETI , 2003, CIDR.

[18]  S. Sudarshan,et al.  DataBlitz storage manager: main-memory database performance for critical applications , 1999, SIGMOD '99.

[19]  Martin L. Kersten,et al.  Generic Database Cost Models for Hierarchical Memory Systems , 2002, VLDB.

[20]  Marcin Zukowski,et al.  MonetDB/X100: Hyper-Pipelining Query Execution , 2005, CIDR.

[21]  Samuel Madden,et al.  TrajStore: An adaptive storage system for very large trajectory data sets , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[22]  Jörg Sander,et al.  PIST: An Efficient and Practical Indexing Technique for Historical Spatio-Temporal Point Data , 2008, GeoInformatica.

[23]  Carolyn Turbyfill,et al.  Performance of complex queries in Main Memory Database Systems , 1987, 1987 IEEE Third International Conference on Data Engineering.

[24]  Hasso Plattner,et al.  A common database approach for OLTP and OLAP using an in-memory column database , 2009, SIGMOD Conference.

[25]  Michael Stonebraker,et al.  C-Store: A Column-oriented DBMS , 2005, VLDB.

[26]  Kai Zheng,et al.  Calibrating trajectory data for similarity-based analysis , 2013, SIGMOD '13.