Graph integration of structured, semistructured and unstructured data for data journalism

[1]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[2]  P. Jaccard,et al.  Etude comparative de la distribution florale dans une portion des Alpes et des Jura , 1901 .

[3]  Vladimir I. Levenshtein,et al.  Binary codes capable of correcting deletions, insertions, and reversals , 1965 .

[4]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[5]  Matthew A. Jaro,et al.  Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida , 1989 .

[6]  R. Ravi,et al.  A polylogarithmic approximation algorithm for the group Steiner tree problem , 2000, SODA '98.

[7]  Albert-László Barabási,et al.  Emergence of scaling in random networks , 1999, Science.

[8]  Vagelis Hristidis,et al.  DISCOVER: Keyword Search in Relational Databases , 2002, VLDB.

[9]  Feng Shao,et al.  XRANK: ranked keyword search over XML documents , 2003, SIGMOD '03.

[10]  David Maier,et al.  From databases to dataspaces: a new abstraction for information management , 2005, SIGMOD Rec..

[11]  Christopher D. Manning,et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.

[12]  Shan Wang,et al.  Finding Top-k Min-Cost Connected Trees in Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[13]  Philip S. Yu,et al.  BLINKS: ranked keyword searches on graphs , 2007, SIGMOD '07.

[14]  Luis Gravano,et al.  Efficient Keyword Search Across Heterogeneous Relational Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[15]  Yi Chen,et al.  Identifying meaningful return information for XML keyword search , 2007, SIGMOD '07.

[16]  Satoshi Sekine,et al.  A survey of named entity recognition and classification , 2007 .

[17]  Martin L. Kersten,et al.  Database Cracking , 2007, CIDR.

[18]  Alon Y. Halevy,et al.  Indexing dataspaces , 2007, SIGMOD '07.

[19]  Anthony K. H. Tung,et al.  A graph method for keyword-based selection of the top-K databases , 2008, SIGMOD Conference.

[20]  Sunita Sarawagi,et al.  Information Extraction , 2008 .

[21]  Beng Chin Ooi,et al.  EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data , 2008, SIGMOD Conference.

[22]  Jeffrey Xu Yu,et al.  Keyword Search in Databases , 2010, Keyword Search in Databases.

[23]  Jeffrey Xu Yu,et al.  Keyword Search in Relational Databases: A Survey , 2010, IEEE Data Eng. Bull..

[24]  Roi Blanco,et al.  Keyword search over RDF graphs , 2011, CIKM '11.

[25]  Serge Abiteboul,et al.  PARIS: Probabilistic Alignment of Relations, Instances, and Schema , 2011, Proc. VLDB Endow..

[26]  Maurizio Lenzerini.  Ontology-based data management , 2011, CIKM '11.

[27]  Gerhard Weikum,et al.  Robust Disambiguation of Named Entities in Text , 2011, EMNLP.

[28]  Diego Calvanese,et al.  Query Processing under GLAV Mappings for Relational and Graph Databases , 2012, Proc. VLDB Endow..

[29]  Alon Y. Halevy,et al.  Principles of Data Integration , 2012 .

[30]  Olivier Galibert,et al.  Extended Named Entities Annotation on OCRed Documents: From Corpus Constitution to Evaluation Campaign , 2012, LREC.

[31]  Thomas Neumann,et al.  Fast approximation of steiner trees in large graphs , 2012, CIKM.

[32]  Anastasia Ailamaki,et al.  NoDB in Action: Adaptive Query Processing on Raw Data , 2012, Proc. VLDB Endow..

[33]  Benoît Sagot,et al.  Annotation référentielle du Corpus Arboré de Paris 7 en entités nommées (Referential named entity annotation of the Paris 7 French TreeBank) [in French] , 2012, JEP/TALN/RECITAL.

[34]  Giorgio Orsi,et al.  A methodology for evaluating algorithms for table understanding in PDF documents , 2012, DocEng '12.

[35]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[36]  Joel Nothman,et al.  Learning multilingual named entity recognition from Wikipedia , 2013, Artif. Intell..

[37]  Tamir Hassan,et al.  ICDAR 2013 Table Competition , 2013, 2013 12th International Conference on Document Analysis and Recognition.

[38]  François Goasdoué,et al.  Fact checking and analyzing the web , 2013, SIGMOD '13.

[39]  François Goasdoué,et al.  Growing triples on trees: an XML-RDF hybrid model for annotated documents , 2013, The VLDB Journal.

[40]  Partha Pratim Talukdar,et al.  Active learning in keyword search-based data integration , 2014, The VLDB Journal.

[41]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[42]  Markus Krötzsch,et al.  Wikidata , 2014 .

[43]  Feifei Li,et al.  Scalable Keyword Search on Large RDF Data , 2014, IEEE Transactions on Knowledge and Data Engineering.

[44]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[45]  Patrick Valduriez,et al.  CloudMdsQL: querying heterogeneous cloud data stores with a common language , 2016, Distributed and Parallel Databases.

[46]  Edleno Silva de Moura,et al.  Ranking Candidate Networks of relations to improve keyword search over relational databases , 2015, ICDE.

[47]  Fabian M. Suchanek,et al.  YAGO3: A Knowledge Base from Multilingual Wikipedias , 2015, CIDR.

[48]  François Goasdoué,et al.  Mixed-instance querying: a lightweight integration architecture for data journalism , 2016, Proc. VLDB Endow..

[49]  Lei Chen,et al.  Keyword Query over Error-Tolerant Knowledge Bases , 2016, Journal of Computer Science and Technology.

[50]  Laure Berti-Équille,et al.  Veracity of Big Data: Challenges of Cross-Modal Truth Discovery , 2016, JDIQ.

[51]  Alon Y. Halevy,et al.  Goods: Organizing Google's Datasets , 2016, SIGMOD Conference.

[52]  Jeffrey Xu Yu,et al.  Efficient and Progressive Group Steiner Tree Search , 2016, SIGMOD Conference.

[53]  François Goasdoué,et al.  Social, Structured and Semantic Search , 2016, EDBT.

[54]  Pierre Senellart,et al.  Conjunctive Queries on Probabilistic Graphs: Combined Complexity , 2017, PODS.

[55]  Reynold Cheng,et al.  An Indexing Framework for Queries on Probabilistic Graphs , 2017, ACM Trans. Database Syst..

[56]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[57]  Ioana Manolescu,et al.  Extracting linked data from statistic spreadsheets , 2017, SBD@SIGMOD.

[58]  Ioana Manolescu,et al.  A Content Management Perspective on Fact-Checking , 2018, WWW.

[59]  Ioana Manolescu,et al.  ConnectionLens: Finding Connections Across Heterogeneous Data Sources , 2018, Proc. VLDB Endow..

[60]  Roland Vollgraf,et al.  Contextual String Embeddings for Sequence Labeling , 2018, COLING.

[61]  Gerhard Weikum,et al.  A Study of the Importance of External Knowledge in the Named Entity Recognition Task , 2018, ACL.

[62]  Gerhard Weikum,et al.  Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs , 2019, SIGIR.

[63]  Roland Vollgraf,et al.  FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP , 2019, NAACL.

[64]  Ioana Manolescu,et al.  Towards Scalable Hybrid Stores: Constraint-Based Rewriting to the Rescue , 2019, SIGMOD Conference.

[65]  Ioana Manolescu,et al.  Keyword Search in Heterogeneous Data Sources , 2020 .

[66]  Nathalie Pernelle,et al.  BECKEY: Understanding, comparing and discovering keys of different semantics in knowledge bases , 2020, Knowl. Based Syst..

[67]  François Goasdoué,et al.  Ontology-Based RDF Integration of Heterogeneous Data , 2020, EDBT.

[68]  Angelos-Christos G. Anadiotis,et al.  Graph-based keyword search in heterogeneous data sources , 2020, ArXiv.

[69]  Gerhard Weikum,et al.  YAGO 4: A Reason-able Knowledge Base , 2020, ESWC.

[70]  George Papadakis,et al.  Entity Resolution: Past, Present and Yet-to-Come , 2020, EDBT.