CPscan: Detecting Bugs Caused by Code Pruning in IoT Kernels

To reduce development costs, IoT vendors tend to build IoT kernels by customizing the Linux kernel, and code pruning is a common part of this customization. However, because of the intrinsic complexity of the Linux kernel and the lack of effective long-term maintenance, IoT vendors may mistakenly delete necessary security operations during pruning, which leads to bugs such as memory leaks and NULL pointer dereferences. Detecting bugs caused by code pruning in IoT kernels is difficult for two reasons: (1) significant structural changes make it hard to precisely locate the deleted security operations (DSOs), and (2) inferring the security impact of a DSO is not trivial, since it requires complex semantic understanding, including the development logic and the context of the corresponding IoT kernel. In this paper, we present CPscan, a system for automatically detecting bugs caused by code pruning in IoT kernels. First, using a new graph-based approach that iteratively conducts structure-aware basic block matching, CPscan precisely and efficiently identifies the DSOs in IoT kernels. Then, CPscan infers the security impact of a DSO by comparing the bounded use chains (where and how a variable is used within potentially influenced code segments) of the security-critical variable associated with it. Specifically, CPscan reports the deletion of a security operation as vulnerable if the bounded use chain of the associated security-critical variable remains the same before and after the deletion. The rationale is that the unchanged uses of a security-critical variable likely still need the security operation, so removing it may have a security impact. Experimental results on 28 IoT kernels from 10 popular IoT vendors show that CPscan identifies 3,193 DSOs and detects 114 new bugs with a reasonably low false-positive rate. Many of these bugs have a long latent period (up to 9 years and 5 months). We believe CPscan paves the way for eliminating the bugs introduced by code pruning in IoT kernels. We will open-source CPscan to facilitate further research.
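
The reporting rule described above (flag a deletion when the bounded use chain is unchanged) can be illustrated with a minimal sketch. This is not CPscan's implementation: the representation of a use as an (operation, location) pair, the function names `bounded_use_chain` and `deletion_is_suspicious`, the numeric `bound`, and the decision to ignore variables with no remaining uses are all assumptions made here for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a "use" is (kind of use, where it occurs).
Use = Tuple[str, str]


def bounded_use_chain(uses: List[Use], bound: int) -> Tuple[Use, ...]:
    """Keep only the first `bound` uses, standing in for the
    'potentially influenced code segments' mentioned in the abstract."""
    return tuple(uses[:bound])


def deletion_is_suspicious(uses_before: List[Use],
                           uses_after: List[Use],
                           bound: int = 8) -> bool:
    """Return True when the security-critical variable is used the same way
    before and after the security operation was deleted."""
    before = bounded_use_chain(uses_before, bound)
    after = bounded_use_chain(uses_after, bound)
    # Assumption of this sketch: a variable with no remaining uses is not flagged.
    return len(before) > 0 and before == after


# Upstream kernel: a NULL check guards these uses of `buf` (illustrative data).
upstream = [("deref", "buf->len"), ("call_arg", "memcpy:arg1")]

# Pruned kernel A: the check was deleted but the uses are untouched.
pruned_same = [("deref", "buf->len"), ("call_arg", "memcpy:arg1")]

# Pruned kernel B: the check was deleted together with the dereference.
pruned_changed = [("call_arg", "memcpy:arg1")]

print(deletion_is_suspicious(upstream, pruned_same))     # flagged
print(deletion_is_suspicious(upstream, pruned_changed))  # not flagged
```

In this toy setting, kernel A is flagged because the uses that the deleted check protected are still present, while kernel B is not because the use pattern changed along with the deletion.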