How to start a company in five (not so) easy steps

[1]  Hakan Hacigümüs,et al.  MISO: souping up big data query processing with a multistore system , 2014, SIGMOD Conference.

[2]  Daniel P. Miranker,et al.  On a model of indexability and its bounds for range queries , 2002, JACM.

[3]  Leslie Lamport,et al.  Paxos Made Simple , 2001 .

[4]  D. J. De Witt,et al.  Direct—A Multiprocessor Organization for Supporting Relational Database Management Systems , 1979 .

[5]  Michael A. Olson,et al.  The Design and Implementation of the Inversion File System , 1993, USENIX Winter.

[6]  M. C. Schatz,et al.  The DNA data deluge , 2013, IEEE Spectrum.

[7]  Eugene Wong,et al.  Decomposition—a strategy for query processing , 1976, TODS.

[8]  V. Gadepally,et al.  Associative array model of SQL, NoSQL, and NewSQL databases , 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).

[9]  Ahmed Eldawy,et al.  NADEEF: a commodity data cleaning system , 2013, SIGMOD '13.

[10]  Peter M. Rice,et al.  The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants , 2009, Nucleic acids research.

[11]  Reind P. van de Riet,et al.  Expert database systems , 1986, Future Gener. Comput. Syst.

[12]  Rada Chirkova,et al.  Enabling query processing across heterogeneous data models: A survey , 2017, 2017 IEEE International Conference on Big Data (Big Data).

[13]  David J. DeWitt,et al.  Benchmarking Database Systems A Systematic Approach , 1983, VLDB.

[14]  David J. DeWitt,et al.  Split query processing in polybase , 2013, SIGMOD '13.

[15]  Aditya G. Parameswaran,et al.  SeeDB: Efficient Data-Driven Visualization Recommendations to Support Visual Analytics , 2015, Proc. VLDB Endow.

[16]  R. Motwani,et al.  Query Processing, Approximation, and Resource Management in a Data Stream Management System , 2003, CIDR.

[17]  Laura M. Haas,et al.  Garlic: a new flavor of federated query processing for DB2 , 2002, SIGMOD '02.

[18]  Ramakrishna Varadarajan,et al.  The Vertica Analytic Database: C-Store 7 Years Later , 2012, Proc. VLDB Endow.

[19]  Felix Naumann,et al.  Profiling relational data: a survey , 2015, The VLDB Journal.

[20]  Jeremy Kepner,et al.  D4M: Bringing associative arrays to database engines , 2015, 2015 IEEE High Performance Extreme Computing Conference (HPEC).

[21]  Kwo-Sen Kuo,et al.  Implementing connected component labeling as a user defined operator for SciDB , 2016, 2016 IEEE International Conference on Big Data (Big Data).

[22]  Michael L. Brodie.  Understanding Data Science: An Emerging Discipline for Data Intensive Discovery , 2015, DAMDID/RCDL.

[23]  Jeffrey Heer,et al.  Wrangler: interactive visual specification of data transformation scripts , 2011, CHI.

[24]  Craig Freedman,et al.  Hekaton: SQL server's memory-optimized OLTP engine , 2013, SIGMOD '13.

[25]  Tamraparni Dasu,et al.  Statistical Distortion: Consequences of Data Cleaning , 2012, Proc. VLDB Endow.

[26]  Donald D. Chamberlin,et al.  SEQUEL: A structured English query language , 1974, SIGFIDET '74.

[27]  Frederick Reiss,et al.  TelegraphCQ: continuous dataflow processing , 2003, SIGMOD '03.

[28]  Stephen N. Zilles,et al.  Programming with abstract data types , 1974 .

[29]  C. Mohan,et al.  DB2's Use of the Coupling Facility for Data Sharing , 1997, IBM Syst. J.

[30]  Alexander S. Szalay,et al.  The Sloan Digital Sky Survey and beyond , 2008, SIGMOD Rec.

[31]  Irving L. Traiger,et al.  The notions of consistency and predicate locks in a database system , 1976, CACM.

[32]  Michael Stonebraker,et al.  BigDAWG version 0.1 , 2017, 2017 IEEE High Performance Extreme Computing Conference (HPEC).

[33]  E. F. Codd,et al.  The relational and network approaches: Comparison of the application programming interfaces , 1975, SIGFIDET '74.

[34]  Philip S. Yu,et al.  Cluster Architectures and S/390 Parallel Sysplex Scalability , 1997, IBM Syst. J.

[35]  Donald J. Haderle,et al.  The History and Growth of IBM's DB2 , 2013, IEEE Annals of the History of Computing.

[36]  Irving L. Traiger,et al.  System R: relational approach to database management , 1976, TODS.

[37]  Stanley B. Zdonik,et al.  Data Ingestion for the Connected World , 2017, CIDR.

[38]  C. H. Faham,et al.  Accelerating Scientific Analysis with SciDB , 2015 .

[39]  Andrew Pavlo,et al.  What's Really New with NewSQL? , 2016, SIGMOD Rec.

[40]  David J. DeWitt,et al.  Query execution in DIRECT , 1979, SIGMOD '79.

[41]  A Robbin,et al.  Creating SIPP longitudinal analysis files using a relational database management system. , 1988 .

[42]  David J. DeWitt,et al.  Parallel database systems: the future of high performance database systems , 1992, CACM.

[43]  Martin L. Kersten,et al.  MonetDB: Two Decades of Research in Column-oriented Database Architectures , 2012, IEEE Data Eng. Bull.

[44]  V. Kevin M. Whitney,et al.  Relational data management implementation techniques , 1974, SIGFIDET '74.

[45]  Divesh Srivastava,et al.  Combining Quantitative and Logical Data Cleaning , 2015, Proc. VLDB Endow.

[46]  Jianzhong Li,et al.  Towards certain fixes with editing rules and master data , 2010, Proc. VLDB Endow.

[47]  Ying Xing,et al.  Scalable Distributed Stream Processing , 2003, CIDR.

[48]  David J. DeWitt,et al.  GAMMA - A High Performance Dataflow Database Machine , 1986, VLDB.

[49]  Jeffrey F. Naughton,et al.  Generalized Search Trees for Database Systems , 1995, VLDB.

[50]  David Maier,et al.  Making smalltalk a database system , 1984, SIGMOD '84.

[51]  Peter Szolovits,et al.  MIMIC-III, a freely accessible critical care database , 2016, Scientific Data.

[52]  Samuel Madden,et al.  Scorpion: Explaining Away Outliers in Aggregate Queries , 2013, Proc. VLDB Endow.

[53]  Nan Tang,et al.  Towards dependable data repairing with fixing rules , 2014, SIGMOD Conference.

[54]  Paolo Papotti,et al.  KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing , 2015, SIGMOD Conference.

[55]  Neil Savage,et al.  Forging relationships , 2015, Commun. ACM.

[56]  Theodore Johnson,et al.  Gigascope: a stream database for network applications , 2003, SIGMOD '03.

[57]  Vijay Gadepally,et al.  Demonstrating the BigDAWG Polystore System for Ocean Metagenomics Analysis , 2017, CIDR.

[58]  Bruce G. Lindsay,et al.  A retrospective of R*: A distributed database management system , 1987, Proceedings of the IEEE.

[59]  Martin L. Kersten,et al.  Breaking the memory wall in MonetDB , 2008, CACM.

[60]  Alvin Cheung,et al.  PipeGen: Data Pipe Generator for Hybrid Analytics , 2016, SoCC.

[61]  Donald D. Chamberlin,et al.  SEQUEL 2: A Unified Approach to Data Definition, Manipulation, and Control , 1976, IBM J. Res. Dev.

[62]  Laura M. Haas,et al.  Towards heterogeneous multimedia information systems: the Garlic approach , 1995, Proceedings RIDE-DOM'95. Fifth International Workshop on Research Issues in Data Engineering-Distributed Object Management.

[63]  Lukasz Golab,et al.  On the relative trust between inconsistent data and inaccurate constraints , 2012, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[64]  Hamid Pirahesh,et al.  ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging , 1998 .

[65]  Marti A. Hearst.  Search User Interfaces , 2009 .

[66]  Paolo Papotti,et al.  Holistic data cleaning: Putting violations into context , 2013, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[67]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[68]  Clayton M. Christensen.  The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail , 2013 .

[69]  Stanley B. Zdonik,et al.  Window-aware load shedding for aggregation queries over data streams , 2006, VLDB.

[70]  John Gantz,et al.  The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East , 2012 .

[71]  E. F. Codd,et al.  A relational model of data for large shared data banks , 1970, CACM.

[72]  Tim Kraska,et al.  Tupleware: "Big" Data, Big Analytics, Small Clusters , 2015, CIDR.

[73]  Hamid Pirahesh,et al.  Extensible query processing in starburst , 1989, SIGMOD '89.

[74]  Paolo Papotti,et al.  Discovering Denial Constraints , 2013, Proc. VLDB Endow.

[75]  C. Mohan,et al.  Concurrency and recovery in generalized search trees , 1997, SIGMOD '97.

[76]  André Csillaghy,et al.  Running Scientific Algorithms as Array Database Operators: Bringing the Processing Power to the Data , 2016, 2016 IEEE International Conference on Big Data (Big Data).

[77]  Stanley B. Zdonik,et al.  Revision Processing in a Stream Processing Engine: A High-Level Design , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[78]  Jennie Duggan,et al.  BigDAWG polystore query optimization through semantic equivalences , 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).

[79]  David J. DeWitt,et al.  Shoring up persistent applications , 1994, SIGMOD '94.

[80]  Ying Xing,et al.  The Design of the Borealis Stream Processing Engine , 2005, CIDR.

[81]  David J. DeWitt,et al.  NiagaraCQ: a scalable continuous query system for Internet databases , 2000, SIGMOD 2000.