EntiTables: Smart Assistance for Entity-Focused Tables

Tables are among the most powerful and practical tools for organizing and working with data. Our motivation is to equip spreadsheet programs with smart assistance capabilities. We concentrate on one particular family of tables, namely, tables with an entity focus. We introduce and focus on two specific tasks: populating rows with additional instances (entities) and populating columns with new headings. We develop generative probabilistic models for both tasks. For estimating the components of these models, we consider a knowledge base as well as a large table corpus. Our experimental evaluation simulates the various stages of the user entering content into an actual table. A detailed analysis of the results shows that the models' components are complementary and that our methods outperform existing approaches from the literature.
