The Smallest Extraction Problem
Franco Milicchio | Valter Crescenzi | Paolo Atzeni | Valerio Cetorelli
[1] Andrew Tomkins,et al. The volume and evolution of web page templates , 2005, WWW '05.
[2] James A. Storer,et al. Data compression via textual substitution , 1982, JACM.
[3] Wei Liu,et al. ViDE: A Vision-Based Approach for Deep Web Data Extraction , 2010, IEEE Transactions on Knowledge and Data Engineering.
[4] Bing Liu,et al. Web data extraction based on partial tree alignment , 2005, WWW '05.
[5] Tim Furche,et al. Robust and Noise Resistant Wrapper Induction , 2016, SIGMOD Conference.
[6] Aditya G. Parameswaran,et al. Optimal schemes for robust web extraction , 2011, Proc. VLDB Endow.
[7] Luca Breveglieri,et al. Formal Languages and Compilation , 2009, Texts in Computer Science.
[8] Matteo Pradella,et al. Toward a theory of input-driven locally parsable languages , 2017, Theor. Comput. Sci.
[9] Qiang Hao,et al. From one tree to a forest: a unified solution for structured web data extraction , 2011, SIGIR.
[10] Berthier A. Ribeiro-Neto,et al. A brief survey of web data extraction tools , 2002, SIGMOD Rec.
[11] Valter Crescenzi,et al. Handling irregularities in ROADRUNNER , 2004, AAAI 2004.
[12] Valter Crescenzi,et al. Extraction and Integration of Partially Overlapping Web Sources , 2013, Proc. VLDB Endow.
[13] Georg Gottlob,et al. The Lixto data extraction project: back and forth between theory and practice , 2004, PODS.
[14] Tim Furche,et al. Joint repairs for web wrappers , 2016, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).
[15] Sumit Gulwani,et al. Web Data Extraction using Hybrid Program Synthesis: A Combination of Top-down and Bottom-up Inference , 2020, SIGMOD Conference.
[16] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[17] Mitesh Patel,et al. Accessing the deep web , 2007, CACM.
[18] C. Jacobs,et al. Parsing Techniques: A Practical Guide, 2nd edition , 2008 .
[19] Matthias Gallé,et al. The Generalized Smallest Grammar Problem , 2016, ICGI.
[20] Judea Pearl,et al. Heuristics : intelligent search strategies for computer problem solving , 1984 .
[21] Nilesh N. Dalvi,et al. Robust web extraction: an approach based on a probabilistic tree-edit model , 2009, SIGMOD Conference.
[22] Hannaneh Hajishirzi,et al. Web-scale Knowledge Collection , 2020, WSDM.
[23] Nicholas Kushmerick,et al. Mining web logs for personalized site maps , 2002, Proceedings of the Third International Conference on Web Information Systems Engineering (Workshops), 2002.
[24] Valter Crescenzi,et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites , 2001, VLDB.
[25] Henning Fernau,et al. On the Complexity of the Smallest Grammar Problem over Fixed Alphabets , 2020, Theory Comput. Syst.
[26] Markus Lohrey,et al. The Smallest Grammar Problem Revisited , 2016, IEEE Transactions on Information Theory.
[27] Valter Crescenzi,et al. Grammars Have Exceptions , 1998, Inf. Syst.
[28] Dick Grune,et al. Parsing Techniques (Monographs in Computer Science) , 2006 .
[29] Rajeev Rastogi,et al. Web-scale information extraction with vertex , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[30] Valter Crescenzi,et al. Alaska: A Flexible Benchmark for Data Integration Tasks , 2021, ArXiv.
[31] Vijay V. Raghavan,et al. Fully automatic wrapper generation for search engines , 2005, WWW '05.
[32] Boris Chidlovskii,et al. Documentum ECI self-repairing wrappers: performance analysis , 2006, SIGMOD Conference.
[33] Khaled Shaalan,et al. FiVaTech: Page-Level Web Data Extraction from Template Pages , 2007, IEEE Transactions on Knowledge and Data Engineering.
[34] Xin Dong,et al. OpenCeres: When Open Information Extraction Meets the Semi-Structured Web , 2019, NAACL.
[35] Valter Crescenzi,et al. Hybrid Crowd-Machine Wrapper Inference , 2019, ACM Trans. Knowl. Discov. Data.
[36] Abhi Shelat,et al. The smallest grammar problem , 2005, IEEE Transactions on Information Theory.
[37] Tim Furche,et al. DIADEM: Thousands of Websites to a Single Database , 2014, Proc. VLDB Endow.
[38] Markus Lohrey,et al. Algorithmics on SLP-compressed strings: A survey , 2012, Groups Complex. Cryptol.
[39] Rafael Corchuelo,et al. Trinity: On Using Trinary Trees for Unsupervised Web Data Extraction , 2014, IEEE Transactions on Knowledge and Data Engineering.
[40] Tobias Dönz. Extracting Structured Data from Web Pages , 2003 .
[41] Valter Crescenzi,et al. Automatic information extraction from large websites , 2004, JACM.
[42] Valter Crescenzi,et al. Crowdsourcing large scale wrapper inference , 2014, Distributed and Parallel Databases.
[43] Robert W. Floyd,et al. Syntactic Analysis and Operator Precedence , 1963, JACM.
[44] Jean-Christophe Aval,et al. Multivariate Fuss-Catalan numbers , 2007, Discret. Math.
[45] Xin Luna Dong,et al. CERES: Distantly Supervised Relation Extraction from the Semi-Structured Web , 2018, Proc. VLDB Endow.
[46] Stefano Crespi-Reghizzi,et al. Operator Precedence and the Visibly Pushdown Property , 2010, LATA.
[47] Jun Ma,et al. AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types , 2020, KDD.
[48] En-Hui Yang,et al. Grammar-based codes: A new class of universal lossless source codes , 2000, IEEE Trans. Inf. Theory.
[49] Tim Furche,et al. WADaR: Joint Wrapper and Data Repair , 2015, Proc. VLDB Endow.
[50] Xiaoying Wu,et al. A survey on XML streaming evaluation techniques , 2013, The VLDB Journal.
[51] Mohd Amir Bin Mohd Azir,et al. Wrapper approaches for web data extraction : A review , 2017, 2017 6th International Conference on Electrical Engineering and Informatics (ICEEI).
[52] Brad Adelberg,et al. NoDoSE—a tool for semi-automatically extracting structured and semistructured data from text documents , 1998, SIGMOD '98.
[53] Tim Furche,et al. RED: Redundancy-Driven Data Extraction from Result Pages , 2019, WWW.