Axiomatic semantics verification of a secure web server

syntax form like this.

    If (Var "x" Int) (Simple (SubAssign (Var "x" Int) (Var "y" Int)))

The meaning of the piece of code is derived by associating form or syntax with semantics by formally defined rules in addition to those given in the formal system. For instance, a formal rule defines that expressions of the form l -= r are equivalent to l = l - r.

With shallow embedding we can prove that z = 0; and z = z - z; have the same effect, namely, that z is assigned the value 0 regardless of its previous value. However, we cannot prove that for all variables v the expressions v = 0; and v = v - v; have the same effect. In contrast, a deep embedding has the tools to prove that for all variables v the expressions Assign v (Const 0 Int) and Assign v (Binary v Sub v) have the same effect.

Although an embedding may generally be said to be shallow or deep, there are many variations. For instance, an otherwise "shallow" embedding may represent function calls syntactically and have inference rules associated with them. A nominally "deep" embedding probably parses informally (a "shallow" embedding of the syntax) rather than carrying strings of characters and describing lexical analysis formally. If we wanted to reason about machine language programs which modify their instructions at run time, we could not even use syntactic abstraction: we would have to model uninterpreted memory contents. So the terms shallow and deep are relative terms depending on the verification.

5.2 A Rule for Preevaluation Side Effects

The assignment axiom for v = expr; is

$$\vdash \{Q^{v}_{expr}\}\ \texttt{v = expr;}\ \{Q\}$$

as long as expr doesn't have any side effects ([21], pp. 15-17) and v is not an alias for anything in Q or expr. Since C statements may have side effects, this rule may not apply. As a simple example, the semantics of a = 2 * ++b; is well defined [25] (it is equivalent to the compound statement ++b; a = 2 * b;), but the statement modifies the value of b as well as a.
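The equivalence claimed for a = 2 * ++b; can be spot-checked by running both forms from the same initial value of b. The following is only a minimal sketch in C: the names ab_state, pre_direct, pre_separated, and assert_pre_agree are hypothetical and do not come from the verification itself, and testing a few values is of course not a proof, which is what the inference rules in this section provide.

```c
#include <assert.h>

/* Final values of a and b after executing a statement. */
struct ab_state {
    int a;
    int b;
};

/* The original statement, with a preevaluation side effect on b. */
struct ab_state pre_direct(int b)
{
    struct ab_state s = { 0, b };
    s.a = 2 * ++s.b;
    return s;
}

/* The compound statement: the side effect first, then a
 * side-effect-free assignment. */
struct ab_state pre_separated(int b)
{
    struct ab_state s = { 0, b };
    ++s.b;
    s.a = 2 * s.b;
    return s;
}

/* Aborts if the two forms disagree for the given initial b.
 * Callers should keep b small enough that 2 * (b + 1) does not overflow. */
void assert_pre_agree(int b)
{
    struct ab_state x = pre_direct(b);
    struct ab_state y = pre_separated(b);
    assert(x.a == y.a);
    assert(x.b == y.b);
}
```

Calling assert_pre_agree for a handful of starting values exercises both forms. Starting from b = 3, either form leaves a = 8 and b = 4, which also shows that the statement changes b as well as a.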
To reason about complex statements, we introduce a general inference rule which derives the correctness of one statement from the correctness of a semantically equivalent statement.

$$\frac{\vdash \mathit{SEM\_EQ}\ stm_1\ stm_2 \qquad \vdash \{pre\}\ stm_1\ \{post\}}{\vdash \{pre\}\ stm_2\ \{post\}} \tag{5.1}$$

The predicate SEM_EQ is true if its two statement arguments are semantically equivalent. The inference rule means that if two statements are semantically equivalent, and there is a partial correctness theorem for precondition, statement stm1, and postcondition, we can conclude an analogous partial correctness theorem for statement stm2. We have not fully formally defined semantic equivalence. Although we have some inference rules for higher level terms, an SML function checks equivalence and specializes the definition.

We introduce the following rule to reason specifically about preevaluation side effects, that is, side effects which take place before the expression is evaluated.

$$\frac{\vdash \mathit{PreEval}\ expr\ stm_1\ stm_2}{\vdash \mathit{SEM\_EQ}\ (\mathit{Seq}\ (\mathit{Simp}\ expr)\ stm_1)\ stm_2} \tag{5.2}$$

Seq is the abstract syntax constructor which creates a statement from a sequence of two statements. Simp converts any expression into a simple statement, which is allowed in C. PreEval is a predicate which is true if extracting the preevaluation side effects expression expr from statement stm2 yields stm1. Informally, the rule means that if stm2 can be separated into expr (which has all preevaluation side effects) and stm1, then expr (in a statement) followed by stm1 is semantically equivalent to stm2.

For example, we can derive theorems about the effect of a = 2 * ++b; from the sequence of simpler statements ++b; a = 2 * b; from the instances of the inference rules

$$\frac{\vdash \mathit{PreEval}\ \ \texttt{++b}\ \ \texttt{a = 2 * b;}\ \ \texttt{a = 2 * ++b;}}{\vdash \mathit{SEM\_EQ}\ (\mathit{Seq}\ (\mathit{Simp}\ \texttt{++b})\ \texttt{a = 2 * b;})\ \ \texttt{a = 2 * ++b;}}$$

and

$$\frac{\vdash \mathit{SEM\_EQ}\ (\texttt{++b; a = 2 * b;})\ \ \texttt{a = 2 * ++b;} \qquad \vdash \{P\}\ \texttt{++b; a = 2 * b;}\ \{Q\}}{\vdash \{P\}\ \texttt{a = 2 * ++b;}\ \{Q\}}$$

Since we have a deep embedding, we can derive a single rule to separate preevaluation side effects using Rules 5.1 and 5.2.
$$\frac{\vdash \mathit{PreEval}\ expr\ stm_1\ stm_2 \qquad \vdash \{pre\}\ (\mathit{Seq}\ (\mathit{Simp}\ expr)\ stm_1)\ \{post\}}{\vdash \{pre\}\ stm_2\ \{post\}} \tag{5.3}$$

Why add another inference rule just to separate side effects? Homeier's language, Sunrise [32], has an operator with a side effect, increment, which can occur in test expressions. He handles this by embedding the semantics of the operator in the inference rules. However, functions, which have arbitrary semantics including side effects, can occur in loop or test expressions in C. Even statements without function calls can have multiple side effects using, say, increment and assignment operators. We take this more general approach to be able to separate a side effect from the expression in which it occurs.

Given the above preference for separation inference rules, why define two rules (one for preevaluation side effects and semantic equivalence and another for semantic equivalence and partial correctness) instead of the single Rule 5.3? Two rules make future development easier since they break proofs and inferences into smaller pieces. The semantic equivalence of preevaluation separation can be proven from, say, a denotational semantics such as [38] without reference to the definition of partial correctness. And total correctness need only have one rule for semantic equivalence rather than a rule for preevaluation side effects, a rule for postevaluation side effects, a rule for conditionals with side effects, etc.

5.3 A Rule for Postevaluation Side Effects

C allows postevaluation side effects in expressions in addition to preevaluation side effects. The statement a = 2 * b++; is well defined, just as in the preevaluation case. The statement can be broken down into the equivalent compound statement a = 2 * b; b++;.

The following rule allows us to reason about postevaluation side effects, that is, side effects which take place after the expression is evaluated.
$$\frac{\vdash \mathit{PostEval}\ stm_1\ expr\ stm_2}{\vdash \mathit{SEM\_EQ}\ (\mathit{Seq}\ stm_1\ (\mathit{Simp}\ expr))\ stm_2} \tag{5.4}$$

PostEval is a predicate which is true if extracting the postevaluation side effects expression expr from statement stm2 yields stm1. Informally, the rule means that if stm2 can be separated into stm1 and expr (which has all postevaluation side effects), then stm1 followed by expr (in a statement) is semantically equivalent to stm2.

As in the case of preevaluation side effects, we can derive a rule to separate postevaluation side effects using Rules 5.4 and 5.1.

$$\frac{\vdash \mathit{PostEval}\ stm_1\ expr\ stm_2 \qquad \vdash \{pre\}\ (\mathit{Seq}\ stm_1\ (\mathit{Simp}\ expr))\ \{post\}}{\vdash \{pre\}\ stm_2\ \{post\}} \tag{5.5}$$

5.4 Side Effects in Conditionals

The rules presented above are inadequate for control statements. For instance, suppose we were allowed to apply the postevaluation Rule 5.5 to the following code.

    if (b++ > 0) {
        t = b;
    } else {
        e = b;
    }

It would be transformed into the following code (note the postincrement afterward), which is not the same. The increment would be delayed until after the entire conditional statement.

    if (b > 0) {
        t = b;
    } else {
        e = b;
    }
    b++;

In the following sections we present inference rules for some control structures and indicate how the general approach could cover many other structures.

Conditionals are the simplest form of control statements for our purposes. Without side effects the inference rule is straightforward:

$$\frac{\vdash \mathit{IS\_VALUE}\ expr\ test \qquad \vdash \{pre \wedge test\}\ thenCode\ \{post\} \qquad \vdash \{pre \wedge \neg test\}\ elseCode\ \{post\}}{\vdash \{pre\}\ \mathit{IfElse}\ (expr)\ thenCode\ elseCode\ \{post\}}$$

Notice we must use IS_VALUE to indicate the equivalence between expr, which is in the program language, and test, which is in the assertion language.

Any preevaluation side effects can be separated and handled with the preevaluation Rule 5.3. However, postevaluation side effects must be handled specially. Figure 5.3 shows the flow in a conditional statement with postevaluation side effects in the test expression.
[Figure 5.3: Control Flow in a Conditional. From the precondition, one path runs through precondition ∧ test, postStm, trueCond, and the "then" code; the other runs through precondition ∧ ¬test, postStm, falseCond, and the "else" code. Both paths end at the postcondition.]

Rectangles are predicates on the program state. The pieces of text `postStm,' `"then" code,' and `"else" code' show code execution. The sequence of events is

1. Find the test condition in the initial state (when the precondition is true),
2. Evaluate the postevaluation side effects, yielding new conditions, then
3. Evaluate the code in the "then" or "else" branch, yielding the postcondition.

This is the corresponding inference rule.

$$\frac{\begin{array}{c}
\vdash \mathit{SEM\_EQ}\ (\mathit{Seq}\ (\mathit{Simp}\ expr)\ postStm)\ (\mathit{Simp}\ ex) \\
\vdash (postStm = \mathit{EmptyStmt}) \vee (postStm = (\mathit{Simp}\ postSeEx) \wedge \mathit{NoPreSE}\ postSeEx) \\
\vdash \mathit{IS\_VALUE}\ expr\ test \\
\vdash \{pre \wedge test\}\ postStm\ \{trueCond\} \\
\vdash \{pre \wedge \neg test\}\ postStm\ \{falseCond\} \\
\vdash \{trueCond\}\ thenCode\ \{post\} \\
\vdash \{falseCond\}\ elseCode\ \{post\}
\end{array}}{\vdash \{pre\}\ \mathit{IfElse}\ (ex)\ thenCode\ elseCode\ \{post\}} \tag{5.6}$$

Informally, in order to prove the partial correctness of the conditional statement, we must prove the following:

- The original test expression code, ex, is split into a side effect free test expression, expr, followed by a statement for any postevaluation side effects, postStm. (Any preevaluation side effects can be removed by Rule 5.3.)
- The postevaluation side effect statement postStm is empty (if there are no side effects), or it is a simple statement of the postevaluation side effects postSeEx with no preevaluation side effects.
- The code expr corresponds to test in the assertion language.
- Executing postStm with test true or false establishes the "true" or "false" conditions respectively.
- Executing the "then" and the "else" code establishes the postcondition.

Typically most of these theorems are proven automatically, thus minimizing the user's work.

An inference rule for one-armed conditionals can be derived from the above rule and the rule which states that one-armed conditionals are semantically equivalent to two-armed conditionals with empty "else" cases (If and IfElse with Empty, Table 6.6, page 58).
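The difference between the ordering described by Rule 5.6 and the incorrect transformation that delays the increment can be observed by executing the three shapes of the conditional. This is an illustrative sketch in C only: the names cond_state, cond_original, cond_split, cond_delayed, and cond_same are hypothetical, and the temporary variable test is an assumption used here to hold the value of the side effect free test expression found in the initial state.

```c
#include <assert.h>

/* Final values of b, t and e after executing a conditional. */
struct cond_state {
    int b;
    int t;
    int e;
};

/* The original code: a postincrement inside the test expression. */
struct cond_state cond_original(int b)
{
    struct cond_state s = { b, 0, 0 };
    if (s.b++ > 0) { s.t = s.b; } else { s.e = s.b; }
    return s;
}

/* The ordering of Rule 5.6: find the test value in the initial state,
 * execute postStm, then run the chosen branch. */
struct cond_state cond_split(int b)
{
    struct cond_state s = { b, 0, 0 };
    int test = s.b > 0;   /* side effect free expr */
    s.b++;                /* postStm */
    if (test) { s.t = s.b; } else { s.e = s.b; }
    return s;
}

/* The incorrect transformation: the increment is delayed until after
 * the entire conditional statement. */
struct cond_state cond_delayed(int b)
{
    struct cond_state s = { b, 0, 0 };
    if (s.b > 0) { s.t = s.b; } else { s.e = s.b; }
    s.b++;
    return s;
}

/* Returns 1 if two final states are identical, 0 otherwise. */
int cond_same(struct cond_state x, struct cond_state y)
{
    return x.b == y.b && x.t == y.t && x.e == y.e;
}
```

Starting from b = 1, cond_original and cond_split both finish with b = 2 and t = 2, while cond_delayed finishes with b = 2 but t = 1, since the "then" branch ran before the increment.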
5.5 Loops with Pre and Post Eval Side Effects

In simple languages the inference rule for a while loop, or backward jump, is straightforward:

$$\frac{\vdash \mathit{IS\_VALUE}\ expr\ test \qquad \vdash \{invariant \wedge test\}\ body\ \{invariant\}}{\vdash \{invariant\}\ \texttt{while}\ expr\ body\ \{invariant \wedge \neg test\}} \tag{5.7}$$

In languages in which the test expression ma

[1]  Natarajan Shankar,et al.  PVS: A Prototype Verification System , 1992, CADE.

[2]  Richard J. Lipton,et al.  Social processes and proofs of theorems and programs , 1977, POPL.

[3]  Michael Norrish An abstract dynamic semantics for C , 1997 .

[4]  Jim Cunningham,et al.  A Note on the Semantic Definition of Side Effects , 1976, Inf. Process. Lett..

[5]  Karl N. Levitt,et al.  A HOL Mechanization of The Axiomatic Semantics of a Simple Distributed Programming Language , 1993 .

[6]  Edmund M. Clarke,et al.  Symbolic Model Checking: 10^20 States and Beyond , 1990, Inf. Comput..

[7]  David Gries,et al.  Assignment and Procedure Call Proof Rules , 1980, TOPL.

[8]  Michael J. C. Gordon,et al.  Programming language theory and its implementation , 1988 .

[9]  Yuri Gurevich,et al.  Evolving Algebras: an Attempt to Discover Semantics , 1993, Current Trends in Theoretical Computer Science.

[10]  Hans-Juergen Boehm,et al.  Side effects and aliasing can have simple axiomatic descriptions , 1985, TOPL.

[11]  Peter V. Homeier Trustworthy Tools for Trustworthy Programs: A Mechanically Verified Verification Condition Generator , 1995 .

[12]  Zohar Manna,et al.  The Logic of Computer Programming , 1978, IEEE Transactions on Software Engineering.

[13]  B. A. Wichmann High Integrity Ada , 1997, SAFECOMP.

[14]  C. A. R. Hoare,et al.  An axiomatic basis for computer programming , 1969, CACM.

[15]  M. Gordon,et al.  Introduction to HOL: a theorem proving environment for higher order logic , 1993 .

[16]  Phillip J. Windley,et al.  Verifying resilient software , 1997, Proceedings of the Thirtieth Hawaii International Conference on System Sciences.

[17]  Lawrence Charles Paulson,et al.  ML for the working programmer , 1991 .

[18]  Paul Curzon Deriving correctness properties of compiled code , 1992, TPHOLs.

[19]  Phillip J. Windley,et al.  Automatically Synthesized Term Denotation Predicates: A Proof Aid , 1995, TPHOLs.

[20]  Michael Norrish Derivation of Verification Rules for C from Operational Definitions , 1996 .

[21]  Graham Collins A Proof Tool for Reasoning About Functional Programs , 1996, TPHOLs.

[22]  Phillip J. Windley,et al.  A Brief Introduction to Formal Methods , 1996 .

[23]  Frederick B. Cohen A secure world-wide-web daemon , 1996, Comput. Secur..

[24]  Michael Dyer The Cleanroom Approach to Quality Software Development , 1992, Int. CMG Conference.

[25]  Zohar Manna,et al.  Studies In Automatic Programming Logic , 1977 .

[26]  Robert W. Floyd,et al.  Assigning Meanings to Programs , 1993 .

[27]  Vito B. L. Di,et al.  Using Formal Methods to Assist in the Requirements Analysis of the Space Shuttle GPS Change Request , 1996 .

[28]  John C. Reynolds,et al.  The craft of programming , 1981, Prentice Hall International series in computer science.

[29]  Phillip J. Windley,et al.  Formal verification of secure programs in the presence of side effects , 1998, Proceedings of the Thirty-First Hawaii International Conference on System Sciences.