Dynamic Complexity of Document Spanners

This paper investigates the dynamic complexity of document spanners, a formal framework for information extraction introduced by Fagin, Kimelfeld, Reiss, and Vansummeren (JACM 2015). We first consider the class of regular spanners and prove that any regular spanner can be maintained in the dynamic complexity class DynPROP. This result follows from earlier work on the dynamic complexity of formal languages by Gelade, Marquardt, and Schwentick (TOCL 2012). To investigate core spanners, we use SpLog, a concatenation logic that exactly captures core spanners. We show that the dynamic complexity class DynCQ is more expressive than SpLog and can therefore maintain any core spanner. We then extend this result to show that DynFO can maintain any generalized core spanner and that DynFO is at least as powerful as SpLog with negation.

[1] Ronald Fagin, et al. Document Spanners, 2015, J. ACM.

[2] Jianwen Su, et al. Nonrecursive incremental evaluation of Datalog queries, 1995, Annals of Mathematics and Artificial Intelligence.

[3] Thomas Schwentick, et al. Dynamic conjunctive queries, 2017, J. Comput. Syst. Sci.

[4] Benny Kimelfeld, et al. Joining Extractions of Regular Expressions, 2017, PODS.

[5] Thomas Zeume, et al. Dynamic Graph Queries, 2015, ICDT.

[6] Dominik D. Freydenberger, et al. Document Spanners: From Expressive Power to Decision Problems, 2016, ICDT.

[7] Neil Immerman, et al. Dyn-FO: A Parallel, Dynamic Complexity Class, 1997, J. Comput. Syst. Sci.

[8] Katja Losemann. Foundations of Regular Languages for Processing RDF and XML, 2015.

[9] Thomas Schwentick, et al. Dynamic complexity: recent updates, 2016, SIGL.

[10] Ronald Fagin, et al. Recursive Programs for Document Spanners, 2017, ICDT.

[11] Dominik D. Freydenberger, et al. Document Spanners: From Expressive Power to Decision Problems, 2017, Theory of Computing Systems.

[12] Arto Salomaa, et al. Pattern languages with and without erasing, 1994.

[13] Frank Neven, et al. Split-Correctness in Information Extraction, 2018, PODS.

[14] Thomas Schwentick, et al. The dynamic complexity of formal languages, 2008, TOCL.

[15] Markus Kröll, et al. Complexity Bounds for Relational Algebra over Document Spanners, 2019, PODS.

[16] Dominik D. Freydenberger. A Logic for Document Spanners, 2018, Theory of Computing Systems.

[17] Markus L. Schmid. Characterising REGEX Languages by Regular Languages Equipped with Factor-Referencing, 2014, Developments in Language Theory.

[18] Stijn Vansummeren, et al. Constant Delay Algorithms for Regular Document Spanners, 2018, PODS.

[19] Antoine Amarilli, et al. Constant-Delay Enumeration for Nondeterministic Document Spanners, 2019, ICDT.

[20] Cristian Riveros, et al. Document Spanners for Extracting Incomplete Information: Expressiveness and Complexity, 2018, PODS.