LLWM & IR-Mark: Integrating Software Watermarks into an LLVM-based Framework

Software protection mechanisms such as DRM and online services hinder the unrestrained duplication of games and applications, but they fail to protect individual software components from reuse by intellectual property thieves. Watermarking can, in principle, discourage software misuse and allow proof of ownership, but embedding watermarks requires expert knowledge, which is why software watermarks are rarely used. This paper presents LLWM, an LLVM-based watermarking framework that automates the embedding of watermarks and thus enables the widespread use of watermarking. LLWM incorporates several (adapted) implementations of existing watermarking techniques and provides the foundation for IR-Mark, a new watermarking technique based on modified register allocation. Because LLWM is built upon LLVM and its compiler Clang, developers can embed a watermark simply by compiling their code, without having to understand and run existing methods and standalone tools. The methods included in LLWM offer a choice of embedding techniques; we evaluate and discuss their characteristics and their resilience against common attacks.
