A command-level study of Linux kernel bugs

As computer systems grow in size and complexity, bugs become subtler and harder to detect and diagnose. A bug can reside in any layer of a computer system (e.g., applications, shared libraries, file systems, device firmware), or can arise from incompatibilities between layers. In many cases, a bug is triggered only by a very specific combination of events and is difficult to reproduce, which further complicates detection and diagnosis. Most existing debugging tools focus on a single layer of the system (e.g., applications); they are intrusive to the target layer and fundamentally limited in analyzing issues that span multiple layers. As a first step towards building a multi-layer diagnostic framework, this paper presents our efforts to study the behavior of Linux kernel bugs at the host-device interface. More specifically, we designed workloads to trigger known kernel bugs, recorded the SCSI commands observed under different kernels, and analyzed the impact of kernel bug patches by counting the occurrences of individual SCSI commands over multiple runs of the workloads. Our preliminary results show that it is possible to identify potential synchronization bugs in the kernel from information at the host-device interface level.
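
The counting step described above can be illustrated with a minimal sketch. It is not the tooling used in the study: the trace format (one command per line, written as hex CDB bytes with the opcode first) and the function names `count_commands`, `total_counts`, and `diff_counts` are assumptions made for illustration. The opcode values in the table are standard SCSI operation codes.

```python
from collections import Counter

# Standard SCSI operation codes (first byte of the CDB); extend as needed.
OPCODE_NAMES = {
    0x00: "TEST UNIT READY",
    0x12: "INQUIRY",
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
    0x35: "SYNCHRONIZE CACHE(10)",
}


def count_commands(trace_lines):
    """Count commands in one run.

    Assumed (hypothetical) trace format: one command per line,
    hex CDB bytes separated by spaces, opcode first.
    """
    counts = Counter()
    for line in trace_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        opcode = int(line.split()[0], 16)
        counts[OPCODE_NAMES.get(opcode, f"0x{opcode:02X}")] += 1
    return counts


def total_counts(runs):
    """Sum per-command counts over multiple runs of the same workload."""
    total = Counter()
    for run in runs:
        total.update(count_commands(run))
    return total


def diff_counts(before, after):
    """Return the commands whose counts differ, as (after - before)."""
    names = set(before) | set(after)
    return {n: after[n] - before[n] for n in sorted(names) if after[n] != before[n]}


# Made-up traces for illustration only; these are not data from the study.
unpatched_runs = [["2a 00 00 00 10 00", "2a 00 00 00 18 00", "28 00 00 00 10 00"]]
patched_runs = [["2a 00 00 00 10 00", "35 00", "2a 00 00 00 18 00", "35 00",
                 "28 00 00 00 10 00"]]

delta = diff_counts(total_counts(unpatched_runs), total_counts(patched_runs))
print(delta)
```

In this invented example the only difference between the two kernels is the number of SYNCHRONIZE CACHE(10) commands, which is the kind of signal that would point towards a synchronization-related change.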
