Transformations on Graph Databases for Polyglot Persistence with NotaQL

Polyglot-persistence applications combine several different data stores. Often, one of them is a graph database that models relationships between data items. The data-transformation language NotaQL can be used to define transformations from one NoSQL database to another. In this paper, we present a language extension for NotaQL that enables graph transformations, graph analysis, and data migrations on graph databases. NotaQL is schema-flexible, offers filters and aggregation functions, and allows for graph traversal and edge creation. Our graph-transformation platform can be used for iterative graph algorithms and bulk processing.
