The Architecture of the Cornell Knowledge Broker

Intelligence applications must process massive amounts of data to extract relevant information, including both archived historical data and continuously arriving new data. We propose a novel architecture that addresses this problem: the Cornell Knowledge Broker. It will support not only knowledge discovery but also security, privacy, information exchange, and collaboration.
