Birthmark-Based Software Classification Using Rough Sets

Software theft, or piracy, is a rapidly growing problem that includes copying, modifying, and misusing proprietary software in violation of its license agreement. A software birthmark is an inherent property of a program that has been used successfully to detect software theft. Two separate pieces of software can be compared for code similarity by using their birthmarks: comparing the birthmarks of the programs in question indicates whether one is a duplicate copy of the other, and a high similarity between the birthmarks of two programs suggests that they are the same. Classifying software as pirated or not pirated nevertheless remains a challenging task. In this paper we therefore present the use of rough set theory, a mathematical approach for dealing with vagueness and uncertainty in classification problems. The technique is validated through an empirical case study. The experiments show that the technique successfully assesses the specified properties of the birthmark and thus provides a valid classification decision.
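To illustrate how rough set theory handles uncertainty in such a classification, the following minimal Python sketch computes the lower and upper approximations of the "pirated" class on a toy decision table. The attribute names (`credibility`, `resilience`), the program identifiers, and all values are hypothetical and chosen only for illustration; they are not data or procedures taken from the paper.

```python
from collections import defaultdict


def indiscernibility_classes(table, attributes):
    """Group object ids that have identical values on the given attributes."""
    groups = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        groups[key].add(obj_id)
    return list(groups.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) rough approximations of the target set."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


# Hypothetical decision table: condition attributes describe a birthmark
# comparison, and "pirated" is the decision attribute.
TABLE = {
    "p1": {"credibility": "high", "resilience": "high", "pirated": True},
    "p2": {"credibility": "high", "resilience": "high", "pirated": True},
    "p3": {"credibility": "high", "resilience": "low", "pirated": True},
    "p4": {"credibility": "high", "resilience": "low", "pirated": False},
    "p5": {"credibility": "low", "resilience": "low", "pirated": False},
}
CONDITIONS = ["credibility", "resilience"]
PIRATED = {obj_id for obj_id, row in TABLE.items() if row["pirated"]}

lower, upper = approximations(TABLE, CONDITIONS, PIRATED)
boundary = upper - lower

print("certainly pirated:", sorted(lower))
print("possibly pirated: ", sorted(upper))
print("undecidable:      ", sorted(boundary))
```

In this toy table, `p3` and `p4` share the same condition values but differ in the decision, so they fall into the boundary region: the available attributes cannot separate them, which is the kind of vagueness rough sets make explicit.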
