[1] Andrei Z. Broder,et al. On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997.
[2] Alon Y. Halevy,et al. Data Integration for the Relational Web , 2009, Proc. VLDB Endow..
[3] Matthias Jarke,et al. Designing a multi-sided data platform: findings from the International Data Spaces case , 2019, Electronic Markets.
[4] Phokion G. Kolaitis,et al. Structural characterizations of schema-mapping languages , 2009, ICDT '09.
[5] Bernhard Mitschang,et al. Modeling Data Lakes with Data Vault: Practical Experiences, Assessment, and Lessons Learned , 2019, ER.
[6] Tim Furche,et al. Data Wrangling for Big Data: Challenges and Opportunities , 2016, EDBT.
[7] Ronald Fagin,et al. Composing schema mappings: second-order dependencies to the rescue , 2004, PODS '04.
[8] Alexandra Roatis,et al. CLAMS: Bringing Quality to Data Lakes , 2016, SIGMOD Conference.
[9] Emanuel Sallinger,et al. On the Undecidability of the Equivalence of Second-Order Tuple Generating Dependencies , 2015, AMW.
[10] Jignesh M. Patel,et al. Enabling JSON Document Stores in Relational Systems , 2013, WebDB.
[11] Ian T. Foster,et al. Skluma: An Extensible Metadata Extraction Pipeline for Disorganized Data , 2018, 2018 IEEE 14th International Conference on e-Science (e-Science).
[12] Dimitrios Tsoumakos,et al. MuSQLE: Distributed SQL query execution over multiple engine environments , 2016, 2016 IEEE International Conference on Big Data (Big Data).
[13] Ajit Singh. Architecture of Data Lake , 2019 .
[14] Cong Yu,et al. Constraint-based XML query rewriting for data integration , 2004, SIGMOD '04.
[15] Boualem Benatallah,et al. Temporal Provenance Model (TPM): Model and Query Language , 2012, ArXiv.
[16] Hassan H. Alrehamy,et al. Personal Data Lake with Data Gravity Pull , 2015, 2015 IEEE Fifth International Conference on Big Data and Cloud Computing.
[17] Huang Fang. Managing data lakes in big data era: What's a data lake and why has it became popular in data management ecosystem , 2015, 2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER).
[18] Chris Douglas,et al. Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics , 2017, SIGMOD Conference.
[19] Sandra Geisler,et al. Constance: An Intelligent Data Lake System , 2016, SIGMOD Conference.
[20] Laks V. S. Lakshmanan,et al. Schema mapping and query translation in heterogeneous P2P XML databases , 2010, The VLDB Journal.
[21] David Maier,et al. From databases to dataspaces: a new abstraction for information management , 2005, SIGMOD Rec..
[22] Emanuel Sallinger,et al. Nested dependencies: structure and reasoning , 2014, PODS.
[23] Dan Wang,et al. Relaxed Functional Dependency Discovery in Heterogeneous Data Lakes , 2019, ER.
[24] Hakan Hacigümüs,et al. MISO: souping up big data query processing with a multistore system , 2014, SIGMOD Conference.
[25] David J. DeWitt,et al. Split query processing in polybase , 2013, SIGMOD '13.
[26] Daniel E. O'Leary,et al. Embedding AI and Crowdsourcing in the Big Data Lake , 2014, IEEE Intelligent Systems.
[27] Christoph Quix,et al. Nested Schema Mappings for Integrating JSON , 2018, ER.
[28] Jérôme Darmont,et al. Modeling Data Lake Metadata with a Data Vault , 2018, IDEAS.
[29] Eitan M. Gurari,et al. Introduction to the theory of computation , 1989 .
[30] Rada Chirkova,et al. Enabling query processing across heterogeneous data models: A survey , 2017, 2017 IEEE International Conference on Big Data (Big Data).
[31] Maurizio Lenzerini,et al. Data integration: a theoretical perspective , 2002, PODS.
[32] Riccardo Torlone,et al. Crossing the finish line faster when paddling the Data Lake with Kayak , 2017, Proc. VLDB Endow..
[33] Meike Klettke,et al. Uncovering the evolution history of data lakes , 2017, 2017 IEEE International Conference on Big Data (Big Data).
[34] Michael Stonebraker,et al. Aurum: A Data Discovery System , 2018, 2018 IEEE 34th International Conference on Data Engineering (ICDE).
[35] Christian Bizer,et al. Stitching Web Tables for Improving Matching Quality , 2017, Proc. VLDB Endow..
[36] Alon Y. Halevy,et al. Principles of Data Integration , 2012 .
[37] David Maier,et al. Principles of dataspace systems , 2006, PODS '06.
[38] Domenico Ursino,et al. A New Metadata Model to Uniformly Handle Heterogeneous Data Lake Sources , 2018, ADBIS.
[39] Zachary G. Ives,et al. Finding Related Tables in Data Lakes for Interactive Data Science , 2020, SIGMOD Conference.
[40] Patrick Valduriez,et al. CloudMdsQL: querying heterogeneous cloud data stores with a common language , 2016, Distributed and Parallel Databases.
[41] Christian Brecher,et al. Towards an Infrastructure Enabling the Internet of Production , 2019, 2019 IEEE International Conference on Industrial Cyber Physical Systems (ICPS).
[42] Michael Olschimke,et al. Building a Scalable Data Warehouse with Data Vault 2.0 , 2015 .
[43] Yannis Papakonstantinou,et al. The SQL++ Unifying Semi-structured Query Language, and an Expressiveness Benchmark of SQL-on-Hadoop, NoSQL and NewSQL Databases , 2014 .
[44] Domenico Ursino,et al. An Approach to Extracting Thematic Views from Highly Heterogeneous Sources of a Data Lake , 2018, SEBD.
[45] Aditya G. Parameswaran,et al. Navigating the Data Lake with DATAMARAN: Automatically Extracting Structure from Log Datasets , 2017, SIGMOD Conference.
[46] Kemele M. Endris,et al. Ontario: Federated Query Processing Against a Semantic Data Lake , 2019, DEXA.
[47] Christoph Quix,et al. Query Rewriting for Heterogeneous Data Lakes , 2018, ADBIS.
[48] Wilson C. Hsieh,et al. Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.
[49] Natalia Miloslavskaya,et al. Big Data, Fast Data and Data Lake Concepts , 2016, BICA.
[50] Sabrina Marczak,et al. A Mapping Study about Data Lakes: An Improved Definition and Possible Architectures , 2019, SEKE.
[51] Ioana Manolescu,et al. Invisible Glue: Scalable Self-Tuning Multi-Stores , 2015, CIDR.
[52] Mayank Bawa,et al. LSH forest: self-tuning indexes for similarity search , 2005, WWW '05.
[53] Stephen R. Gardner. Building the data warehouse , 1998, CACM.
[54] Michael Stonebraker,et al. The BigDAWG Polystore System , 2015, SIGMOD Rec..
[55] Siti Mariyam Hj. Shamsuddin,et al. Machine Learning in Data Lake for Combining Data Silos , 2017, DMBD.
[56] Moses Charikar,et al. Similarity estimation techniques from rounding algorithms , 2002, STOC '02.
[57] Reynold Xin,et al. Finding related tables , 2012, SIGMOD Conference.
[58] Phokion G. Kolaitis. Schema mappings, data exchange, and metadata management , 2005, PODS.
[59] Anne Laurent,et al. The next information architecture evolution: the data lake wave , 2016, MEDES.
[60] Renée J. Miller,et al. Table Union Search on Open Data , 2018, Proc. VLDB Endow..
[61] Raul Castro Fernandez,et al. Lazo: A Cardinality-Based Method for Coupled Estimation of Jaccard Similarity and Containment , 2019, 2019 IEEE 35th International Conference on Data Engineering (ICDE).
[62] Mary Roth,et al. Data Wrangling: The Challenging Yourney from the Wild to the Lake , 2015, CIDR.
[64] Alon Y. Halevy,et al. Managing Google's data lake: an overview of the Goods system , 2016, IEEE Data Eng. Bull..
[65] Simon Scerri,et al. Querying Data Lakes using Spark and Presto , 2019, WWW.
[66] Boualem Benatallah,et al. CoreKG: a Knowledge Lake Service , 2018, Proc. VLDB Endow..
[67] Sandra Geisler,et al. An Integrated Ontology-Based Approach for Patent Classification in Medical Engineering , 2017, DILS.
[68] Alberto Abelló,et al. Keeping the Data Lake in Form , 2020, ACM Trans. Inf. Syst..
[69] Renée J. Miller,et al. LSH Ensemble: Internet-Scale Domain Search , 2016, Proc. VLDB Endow..
[70] Paolo Papotti,et al. Nested mappings: schema mapping reloaded , 2006, VLDB.
[71] Reinhard Pichler,et al. The complexity of evaluating tuple generating dependencies , 2011, ICDT '11.
[72] Renée J. Miller,et al. JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes , 2019, SIGMOD Conference.
[73] Cécile Favre,et al. Metadata Systems for Data Lakes: Models and Features , 2019, ADBIS.
[74] Yasser Abdel-Rady I. Mohamed,et al. Data Lake Lambda Architecture for Smart Grids Big Data Analytics , 2018, IEEE Access.
[75] Miguel A. Martínez-Prieto,et al. Integrating flight-related information into a (Big) data lake , 2017, 2017 IEEE/AIAA 36th Digital Avionics Systems Conference (DASC).
[76] Tore Risch,et al. Querying combined cloud-based and relational databases , 2011, 2011 International Conference on Cloud and Service Computing.
[77] Erik Schultes,et al. The FAIR Guiding Principles for scientific data management and stewardship , 2016, Scientific Data.
[78] Renée J. Miller,et al. Value invention in data exchange , 2013, SIGMOD '13.
[79] Rui Liu,et al. Draining the Data Swamp: A Similarity-based Approach , 2018, HILDA@SIGMOD.
[80] Alberto Abelló,et al. Keeping the Data Lake in Form: DS-kNN Datasets Categorization Using Proximity Mining , 2019, MEDI.
[82] Norman W. Paton,et al. Dataset Discovery in Data Lakes , 2020, 2020 IEEE 36th International Conference on Data Engineering (ICDE).
[83] Jérôme Darmont,et al. On data lake architectures and metadata management , 2020, Journal of Intelligent Information Systems.
[84] Marcelo Arenas,et al. The language of plain SO-tgds: Composition, inversion and structural properties , 2013, J. Comput. Syst. Sci..
[85] Toon Calders,et al. Towards Information Profiling: Data Lake Content Metadata Management , 2016, 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW).
[86] Alon Y. Halevy,et al. Goods: Organizing Google's Datasets , 2016, SIGMOD Conference.
[87] Christian Mathis,et al. SAP HANA Vora: A Distributed Computing Platform for Enterprise Data Lakes , 2017, BTW.
[88] Erton Boci,et al. A novel big data architecture in support of ADS-B data analytic , 2015, 2015 Integrated Communication, Navigation and Surveillance Conference (ICNS).
[89] Christoph Quix,et al. GEMMS: A Generic and Extensible Metadata Management System for Data Lakes , 2016, CAiSE Forum.
[90] Renée J. Miller,et al. Organizing Data Lakes for Navigation , 2020, SIGMOD Conference.
[91] Toon Calders,et al. DS-Prox: Dataset Proximity Mining for Governing the Data Lake , 2017, SISAP.
[92] Boualem Benatallah,et al. CoreDB: a Data Lake Service , 2017, CIKM.
[93] Riccardo Torlone,et al. KAYAK: A Framework for Just-in-Time Data Preparation in a Data Lake , 2018, CAiSE.
[94] Robert Wrembel,et al. From conceptual design to performance optimization of ETL workflows: current state of research and open problems , 2017, The VLDB Journal.
[95] Wieslawa Gryncewicz,et al. Agile Approach to Develop Data Lake Based Systems , 2020 .
[96] Giuseppe Polese,et al. Relaxed Functional Dependencies—A Survey of Approaches , 2016, IEEE Transactions on Knowledge and Data Engineering.
[97] Christoph Quix,et al. Rewriting of Plain SO Tgds into Nested Tgds , 2019, Proc. VLDB Endow..
[98] Philip A. Bernstein,et al. Composition of mappings given by embedded dependencies , 2005, PODS '05.
[99] Sunita Sarawagi,et al. Answering Table Queries on the Web using Column Keywords , 2012, Proc. VLDB Endow..
[100] Renée J. Miller,et al. Data Lake Management: Challenges and Opportunities , 2019, Proc. VLDB Endow..
[101] Paolo Papotti,et al. Scalable data exchange with functional dependencies , 2010, Proc. VLDB Endow..
[102] Carlo Curino,et al. Automating the database schema evolution process , 2012, The VLDB Journal.
[103] Yi Zhang,et al. Dataset Relationship Management , 2019, CIDR.
[104] Christian Mathis,et al. Data Lakes , 2017, Datenbank-Spektrum.
[105] Diego Calvanese,et al. DL-Lite: Tractable Description Logics for Ontologies , 2005, AAAI.
[106] Renée J. Miller,et al. Open Data Integration , 2018, Proc. VLDB Endow..
[107] Jukka Riekki,et al. Implementing Big Data Lake for Heterogeneous Data Sources , 2019, 2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW).
[108] Jennifer Widom,et al. The Beckman Report on Database Research , 2014, SIGMOD Rec..
[109] David J. Groggel,et al. Practical Nonparametric Statistics , 2000, Technometrics.
[110] Matthias Jarke,et al. On Warehouses, Lakes, and Spaces: The Changing Role of Conceptual Modeling for Data Integration , 2017, Conceptual Modeling Perspectives.
[111] Beth Plale,et al. Crossing analytics systems: A case for integrated provenance in data lakes , 2016, 2016 IEEE 12th International Conference on e-Science (e-Science).