Data-thirsty business analysts need SODA: search over data warehouse

Querying large data warehouses is very hard for non-tech savvy business users. Deep technical knowledge of both SQL as well as the schema of the database is required in order to build correct queries and to come up with new business insights. In this paper we introduce a novel system called SODA (Search Over DAta Warehouse) that bridges the gap between the business world and the IT world by enabling extended keyword search in a data warehouse. SODA uses metadata information, DBpedia entries as well as base data to generate SQL to allow intuitive exploration of the data. The process of query classification, query graph generation and SQL generation is visualized to provide the analysts with information on how the query results are produced. Experiments with real data of a global financial institution comprising around 300 tables showed promising results.

[1]  Vikram Gadi,et al.  SPARK2: Top-k Keyword Query in Relational Databases , 2012 .

[2]  Jennifer Widom,et al.  Indexing relational database content offline for efficient keyword-based search , 2005, 9th International Database Engineering & Application Symposium (IDEAS'05).

[3]  Surajit Chaudhuri,et al.  DBXplorer: a system for keyword-based search over relational databases , 2002, Proceedings 18th International Conference on Data Engineering.

[4]  Sonia Bergamaschi,et al.  Keyword search over relational databases: a metadata approach , 2011, SIGMOD '11.

[5]  Vagelis Hristidis,et al.  DISCOVER: Keyword Search in Relational Databases , 2002, VLDB.

[6]  Philip S. Yu,et al.  BLINKS: ranked keyword searches on graphs , 2007, SIGMOD '07.

[7]  Luis Gravano,et al.  Efficient Keyword Search Across Heterogeneous Relational Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[8]  S. Sudarshan,et al.  Keyword searching and browsing in databases using BANKS , 2002, Proceedings 18th International Conference on Data Engineering.