Quantitatively Measuring Privacy in Interactive Query Settings Within RDBMS Framework

Little attention has been paid to measuring privacy risk in database management systems, despite their prevalence as a means of data access. This paper proposes PriDe, a quantitative privacy metric that provides a measure (a privacy score) of the privacy risk incurred when executing queries in relational database management systems. PriDe measures the degree to which the attribute values retrieved by a principal (user) during an interactive query session represent a reduction of privacy relative to the attribute values that the principal has previously retrieved. It can be deployed in interactive query settings, where the user sends SQL queries to the database and receives results at run-time, and it gives privacy-conscious organizations a way to monitor, in terms of privacy, how the application data made available to third parties is used. The proposed approach is applicable, without loss of generality, to BigSQL-style technologies. Additionally, the paper proposes a privacy equivalence relation that facilitates the computation of the privacy score.
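
The idea of scoring a query result against what a principal has already retrieved can be illustrated with a toy sketch. The abstract does not define how PriDe computes its score, so the code below is not the paper's metric: the class name NoveltyTracker, the representation of result rows as dictionaries, and the "fraction of previously unseen attribute values" score are all assumptions made purely for illustration.

```python
from collections import defaultdict


class NoveltyTracker:
    """Toy stand-in for a privacy score (hypothetical, NOT the PriDe definition).

    For each principal it remembers every (attribute, value) pair already
    retrieved and scores a new result set by the fraction of its pairs
    that the principal has not seen before.
    """

    def __init__(self):
        # principal -> set of (attribute, value) pairs retrieved so far
        self._seen = defaultdict(set)

    def score(self, principal, rows):
        """Return a value in [0, 1]; higher means more new information."""
        retrieved = {(attr, value) for row in rows for attr, value in row.items()}
        if not retrieved:
            return 0.0
        new_pairs = retrieved - self._seen[principal]
        # Record the retrieval so later queries are compared against it.
        self._seen[principal] |= retrieved
        return len(new_pairs) / len(retrieved)


tracker = NoveltyTracker()
first = tracker.score("alice", [{"name": "Bob", "zip": "12345"}])
repeat = tracker.score("alice", [{"name": "Bob", "zip": "12345"}])
partial = tracker.score("alice", [{"name": "Bob", "zip": "99999"}])
```

In this sketch a repeated query scores lower than its first execution because it reveals nothing new to the same principal, which mirrors only the general intuition stated in the abstract.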
