Mining Encrypted Software Logs using Alpha Algorithm

The growing complexity of software with respect to technological advances encourages model-based analysis of software systems for validation and verification. Process mining is one recently investigated technique for such analysis which enables the discovery of process models from event logs collected during software execution. However, the usage of logs in process mining can be harmful to the privacy of data owners. While for a software user the existence of sensitive information in logs can be a concern, for a software company, the intellectual property of their product and confidential company information within logs can pose a threat to company’s privacy. In this paper, we propose a privacy-preserving protocol for the discovery of process models for software analysis that assures the privacy of users and companies. For this purpose, our proposal uses encrypted logs and processes them using cryptographic protocols in a two-party setting. Furthermore, our proposal applies data packing on the cryptographic protocols to optimize computations by reducing the number of repetitive operations. The experiments show that using data packing the performance of our protocol is promising for privacy-preserving software analysis. To the best of our knowledge, our protocol is the first of its kind for the software analysis which relies on processing of encrypted logs using process mining techniques.

[1]  Wil M. P. van der Aalst,et al.  Process mining in software systems: Discovering real-life business transactions and process models from distributed systems , 2015, 2015 ACM/IEEE 18th International Conference on Model Driven Engineering Languages and Systems (MODELS).

[2]  Chen Fu,et al.  Is Data Privacy Always Good for Software Testing? , 2010, 2010 IEEE 21st International Symposium on Software Reliability Engineering.

[3]  Zekeriya Erkin,et al.  Generating Private Recommendations Efficiently Using Homomorphic Encryption and Data Packing , 2012, IEEE Transactions on Information Forensics and Security.

[4]  Dawn Xiaodong Song,et al.  TaintEraser: protecting sensitive data leaks using application-level taint tracking , 2011, OPSR.

[5]  Tomas Toft,et al.  Secure Equality and Greater-Than Tests with Sublinear Online Complexity , 2013, ICALP.

[6]  Zekeriya Erkin,et al.  Efficient and secure equality tests , 2016, 2016 IEEE International Workshop on Information Forensics and Security (WIFS).

[7]  Pascal Paillier,et al.  Public-Key Cryptosystems Based on Composite Degree Residuosity Classes , 1999, EUROCRYPT.

[8]  Georgios Gousios,et al.  The GHTorent dataset and tool suite , 2013, 2013 10th Working Conference on Mining Software Repositories (MSR).

[9]  David Lo,et al.  kbe-anonymity: test data anonymization for evolving programs , 2012, 2012 Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering.

[10]  Christian S. Collberg,et al.  Software watermarking: models and dynamic embeddings , 1999, POPL '99.

[11]  Wil M. P. van der Aalst,et al.  Workflow mining: discovering process models from event logs , 2004, IEEE Transactions on Knowledge and Data Engineering.

[12]  Wil M. P. van der Aalst,et al.  Process Mining , 2016, Springer Berlin Heidelberg.

[13]  Boudewijn F. van Dongen,et al.  Process Mining Framework for Software Processes , 2007, ICSP.

[14]  Byung-Gon Chun,et al.  TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones , 2010, OSDI.

[15]  Josh Levenberg,et al.  Why Google stores billions of lines of code in a single repository , 2016, Commun. ACM.

[16]  Santiago Comella-Dorda,et al.  Model-Based Verification: An Engineering Practice , 2002 .

[17]  David Aucsmith,et al.  Tamper Resistant Software: An Implementation , 1996, Information Hiding.

[18]  Nasir D. Memon,et al.  Preventing Piracy, Reverse Engineering, and Tampering , 2003, Computer.

[19]  Miguel Castro,et al.  Better bug reporting with better privacy , 2008, ASPLOS 2008.

[20]  Wil M. P. van der Aalst,et al.  Big Software on the Run In Vivo Software Analytics Based on Process Mining (Keynote) , 2015 .

[21]  Christian S. Collberg,et al.  A Taxonomy of Obfuscating Transformations , 1997 .

[22]  Peter M. Broadwell,et al.  Scrash: A System for Generating Secure Crash Information , 2003, USENIX Security Symposium.

[23]  Marcello Cinque,et al.  Log-Based Failure Analysis of Complex Systems: Methodology and Relevant Applications , 2013 .