Tell You a Definite Answer: Whether Your Data is Tainted During Thread Scheduling

With the advent of multicore processors, there is a great need to write parallel programs that take advantage of parallel computing resources. However, because parallel execution is nondeterministic, malware behaviors that are sensitive to thread scheduling are extremely difficult to detect. Dynamic taint analysis is widely used to address security problems. Existing dynamic taint analysis techniques serialize a multithreaded execution and then propagate taint tags along that single serialized schedule, which leads to under-tainting with respect to the other interleavings that are possible under the same input. In this paper, we propose an approach called DSTAM that integrates symbolic analysis and guided execution to systematically detect tainted instances on all possible executions under a given input. Symbolic analysis infers alternative interleavings of an executed trace that cover new tainted instances, and computes thread schedules that guide future executions. Guided execution explores new execution traces that in turn drive further symbolic analysis. We have implemented a prototype as part of an educational tool that teaches secure C programming, where accuracy is more critical than efficiency. To the best of our knowledge, DSTAM is the first algorithm that addresses the challenge of taint analysis for multithreaded programs under fixed inputs.
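The under-tainting problem described above can be illustrated with a toy example. The sketch below is not DSTAM: it replaces symbolic analysis and guided execution with brute-force enumeration of every interleaving of two very short threads, which is only feasible at this scale. The event encoding, the names `T1`, `T2`, `propagate`, and `interleavings`, and the rule that assigning an untainted value clears the destination's taint are all assumptions made for illustration.

```python
from itertools import combinations

# Each event is (dst, src), meaning "dst = src".
# "INPUT" is the (hypothetical) taint source; "CONST" is an untainted constant.
T1 = [("x", "INPUT"), ("x", "CONST")]  # thread 1: x = input(); x = 0
T2 = [("y", "x")]                      # thread 2: y = x

def propagate(schedule):
    """Propagate taint along one serialized schedule.

    Returns the set of tainted instances, each identified as (thread_id, index).
    """
    tainted_vars = set()
    tainted_events = set()
    for tid, idx, (dst, src) in schedule:
        if src == "INPUT" or src in tainted_vars:
            tainted_vars.add(dst)
            tainted_events.add((tid, idx))
        else:
            # Assumed rule: overwriting with an untainted value clears the taint.
            tainted_vars.discard(dst)
    return tainted_events

def interleavings(t1, t2):
    """Yield every interleaving of t1 and t2 that preserves per-thread order."""
    n, m = len(t1), len(t2)
    for slots in combinations(range(n + m), n):
        slots = set(slots)
        i = j = 0
        schedule = []
        for k in range(n + m):
            if k in slots:
                schedule.append((1, i, t1[i]))
                i += 1
            else:
                schedule.append((2, j, t2[j]))
                j += 1
        yield schedule

# One observed serialized schedule: thread 1 runs to completion, then thread 2.
observed = propagate([(1, 0, T1[0]), (1, 1, T1[1]), (2, 0, T2[0])])

# Union of tainted instances over all interleavings under the same input.
all_tainted = set()
for schedule in interleavings(T1, T2):
    all_tainted |= propagate(schedule)

# Instances that the single serialized schedule failed to report.
missed = all_tainted - observed
```

Here the observed schedule taints only the first assignment in thread 1, because `x` has already been overwritten by the time thread 2 reads it. In the interleaving where thread 2 runs between the two assignments of thread 1, `y = x` is also tainted, so `missed` contains that instance. Enumeration grows combinatorially with trace length, which is why an approach that infers only the interleavings covering new tainted instances is attractive.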
