On the Use of Underspecified Data-Type Semantics for Type Safety in Low-Level Code

In recent projects on operating-system verification, C and C++ data types are often formalized using a semantics that does not fully specify the precise byte encoding of objects. It is well-known that such an underspecified data-type semantics can be used to detect certain kinds of type errors. In general, however, underspecified data-type semantics are unsound: they assign well-defined meaning to programs that have undefined behavior according to the C and C++ language standards. A precise characterization of the type-correctness properties that can be enforced with underspecified data-type semantics is still missing. In this paper, we identify strengths and weaknesses of underspecified data-type semantics for ensuring type safety of low-level systems code. We prove sufficient conditions to detect certain classes of type errors and, finally, identify a trade-off between the complexity of underspecified data-type semantics and their type-checking capabilities.