Utility-maximizing event stream suppression

Complex Event Processing (CEP) has emerged as a technology for monitoring event streams in search of user-specified event patterns. When a CEP system is deployed in a sensitive environment, the user may wish to mitigate leaks of private information while ensuring that useful nonsensitive patterns are still reported. In this paper we consider how to suppress events in a stream so as to reduce the disclosure of sensitive patterns while maximizing the detection of nonsensitive patterns. We first formally define the problem of utility-maximizing event suppression with privacy preferences and analyze its computational hardness. We then design a suite of real-time solutions to this problem. Our first solution solves the problem optimally at the event-type level. The second solution, at the event-instance level, further refines the event-type level solution by exploiting runtime event distributions through advanced pattern match cardinality estimation techniques. Our user study and experimental evaluation over both real-world and synthetic event streams show that our algorithms are effective in maximizing utility while remaining efficient enough to offer near real-time system responsiveness.
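To make the event-type level formulation concrete, the following is a minimal toy sketch and not the algorithm from the paper. It assumes a simplified model in which each pattern is a set of event types with a weight, suppressing an event type disables every pattern that contains it, and utility is the total weight of nonsensitive patterns still detectable minus the total weight of sensitive patterns still detectable. The function name `best_type_suppression`, the tuple layout, and the exhaustive search are all illustrative assumptions.

```python
from itertools import combinations


def best_type_suppression(patterns):
    """Exhaustively pick the set of event types to suppress.

    patterns: list of (event_types, weight, is_sensitive) tuples
    (a hypothetical representation, assumed for illustration).
    Returns (suppressed_types, utility) with the highest utility.
    """
    all_types = sorted({t for types, _, _ in patterns for t in types})
    best_utility, best_suppressed = None, None
    # Try every subset of event types; feasible only for tiny inputs.
    for k in range(len(all_types) + 1):
        for suppressed in combinations(all_types, k):
            s = set(suppressed)
            utility = 0
            for types, weight, is_sensitive in patterns:
                # A pattern is still detectable only if none of its
                # event types has been suppressed.
                if s.isdisjoint(types):
                    utility += -weight if is_sensitive else weight
            if best_utility is None or utility > best_utility:
                best_utility, best_suppressed = utility, s
    return best_suppressed, best_utility


# Hypothetical patterns over event types A-D.
patterns = [
    (("A", "B"), 5, False),
    (("B", "C"), 8, True),   # sensitive pattern
    (("C", "D"), 2, False),
    (("A", "D"), 3, False),
]
print(best_type_suppression(patterns))
```

In this toy instance, suppressing only type C hides the sensitive pattern at the cost of one low-weight nonsensitive pattern. The exhaustive search is exponential in the number of event types, so it serves only to illustrate the objective, not as a practical method.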
