Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval

Tables contain valuable knowledge in a structured form. We employ neural language modeling approaches to embed tabular data into vector spaces. Specifically, we consider different table elements, such as the caption, column headings, and cells, for training word and entity embeddings. These embeddings are then utilized in three table-related tasks (row population, column population, and table retrieval) by incorporating them into existing retrieval models as additional semantic similarity signals. Evaluation results show that table embeddings can significantly improve upon the performance of state-of-the-art baselines.
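As a rough illustration of what "an additional semantic similarity signal" could look like, the sketch below averages the embeddings of query terms and of table terms, takes their cosine similarity, and mixes it with a baseline retrieval score. The linear interpolation, the `weight` parameter, the function names, and the toy two-dimensional vectors are all assumptions made for this example; the abstract does not specify how the signals are combined, so this should not be read as the method used in the paper.

```python
import math

def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def combined_score(baseline_score, query_terms, table_terms, embeddings, weight=0.5):
    """Mix a baseline score with an embedding-based similarity.

    Hypothetical combination (linear interpolation); terms without an
    embedding are ignored, and the baseline score is returned unchanged
    when either side has no embedded terms.
    """
    query_vecs = [embeddings[t] for t in query_terms if t in embeddings]
    table_vecs = [embeddings[t] for t in table_terms if t in embeddings]
    if not query_vecs or not table_vecs:
        return baseline_score
    similarity = cosine(centroid(query_vecs), centroid(table_vecs))
    return (1 - weight) * baseline_score + weight * similarity

# Toy embeddings, invented for illustration only.
toy_embeddings = {
    "city": [1.0, 0.0],
    "country": [1.0, 0.0],
    "population": [0.0, 1.0],
}

score = combined_score(0.4, ["city"], ["country"], toy_embeddings, weight=0.5)
print(score)
```

In practice the embeddings would come from a model trained on table elements (words from captions and headings, or entities from cells), and the similarity would more likely be fed to a learned ranking model as one feature among several rather than mixed in with a fixed weight.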
