| 8744 |      1 | (*<*)
 | 
|  |      2 | theory CodeGen = Main:
 | 
|  |      3 | (*>*)
 | 
|  |      4 | 
 | 
|  |      5 | text{*\noindent
 | 
|  |      6 | The task is to develop a compiler from a generic type of expressions (built
 | 
|  |      7 | up from variables, constants and binary operations) to a stack machine.  This
 | 
|  |      8 | generic type of expressions is a generalization of the boolean expressions in
 | 
|  |      9 | \S\ref{sec:boolex}.  This time we do not commit ourselves to a particular
 | 
|  |     10 | type of variables or values but make them type parameters.  Neither is there
 | 
|  |     11 | a fixed set of binary operations: instead the expression contains the
 | 
|  |     12 | appropriate function itself.
 | 
|  |     13 | *}
 | 
|  |     14 | 
 | 
|  |     15 | types 'v binop = "'v \\<Rightarrow> 'v \\<Rightarrow> 'v";
 | 
|  |     16 | datatype ('a,'v)expr = Cex 'v
 | 
|  |     17 |                      | Vex 'a
 | 
|  |     18 |                      | Bex "'v binop"  "('a,'v)expr"  "('a,'v)expr";
 | 
|  |     19 | 
 | 
|  |     20 | text{*\noindent
 | 
| 8771 |     21 | The three constructors represent constants, variables and the application of
 | 
|  |     22 | a binary operation to two subexpressions.
 | 
| 8744 |     23 | 
 | 
|  |     24 | The value of an expression w.r.t.\ an environment that maps variables to
 | 
|  |     25 | values is easily defined:
 | 
|  |     26 | *}
 | 
|  |     27 | 
 | 
| 8771 |     28 | consts value :: "('a,'v)expr \\<Rightarrow> ('a \\<Rightarrow> 'v) \\<Rightarrow> 'v";
 | 
| 8744 |     29 | primrec
 | 
| 8771 |     30 | "value (Cex v) env = v"
 | 
|  |     31 | "value (Vex a) env = env a"
 | 
|  |     32 | "value (Bex f e1 e2) env = f (value e1 env) (value e2 env)";
 | 
| 8744 |     33 | 
 | 
|  |     34 | text{*
 | 
|  |     35 | The stack machine has three instructions: load a constant value onto the
 | 
|  |     36 | stack, load the contents of a certain address onto the stack, and apply a
 | 
|  |     37 | binary operation to the two topmost elements of the stack, replacing them by
 | 
|  |     38 | the result. As for \isa{expr}, addresses and values are type parameters:
 | 
|  |     39 | *}
 | 
|  |     40 | 
 | 
|  |     41 | datatype ('a,'v) instr = Const 'v
 | 
|  |     42 |                        | Load 'a
 | 
|  |     43 |                        | Apply "'v binop";
 | 
|  |     44 | 
 | 
|  |     45 | text{*
 | 
| 8771 |     46 | The execution of the stack machine is modelled by a function
 | 
|  |     47 | \isa{exec} that takes a list of instructions, a store (modelled as a
 | 
|  |     48 | function from addresses to values, just like the environment for
 | 
|  |     49 | evaluating expressions), and a stack (modelled as a list) of values,
 | 
|  |     50 | and returns the stack at the end of the execution---the store remains
 | 
|  |     51 | unchanged:
 | 
| 8744 |     52 | *}
 | 
|  |     53 | 
 | 
| 8771 |     54 | consts exec :: "('a,'v)instr list \\<Rightarrow> ('a\\<Rightarrow>'v) \\<Rightarrow> 'v list \\<Rightarrow> 'v list";
 | 
| 8744 |     55 | primrec
 | 
| 8771 |     56 | "exec [] s vs = vs"
 | 
|  |     57 | "exec (i#is) s vs = (case i of
 | 
|  |     58 |     Const v  \\<Rightarrow> exec is s (v#vs)
 | 
|  |     59 |   | Load a   \\<Rightarrow> exec is s ((s a)#vs)
 | 
|  |     60 |   | Apply f  \\<Rightarrow> exec is s ((f (hd vs) (hd(tl vs)))#(tl(tl vs))))";
 | 
| 8744 |     61 | 
 | 
|  |     62 | text{*\noindent
 | 
|  |     63 | Recall that \isa{hd} and \isa{tl}
 | 
|  |     64 | return the first element and the remainder of a list.
 | 
|  |     65 | Because all functions are total, \isa{hd} is defined even for the empty
 | 
|  |     66 | list, although we do not know what the result is. Thus our model of the
 | 
|  |     67 | machine always terminates properly, although the above definition does not
 | 
|  |     68 | tell us much about the result in situations where \isa{Apply} was executed
 | 
|  |     69 | with fewer than two elements on the stack.
 | 
|  |     70 | 
 | 
|  |     71 | The compiler is a function from expressions to a list of instructions. Its
 | 
|  |     72 | definition is pretty much obvious:
 | 
|  |     73 | *}
 | 
|  |     74 | 
 | 
|  |     75 | consts comp :: "('a,'v)expr \\<Rightarrow> ('a,'v)instr list";
 | 
|  |     76 | primrec
 | 
|  |     77 | "comp (Cex v)       = [Const v]"
 | 
|  |     78 | "comp (Vex a)       = [Load a]"
 | 
|  |     79 | "comp (Bex f e1 e2) = (comp e2) @ (comp e1) @ [Apply f]";
 | 
|  |     80 | 
 | 
|  |     81 | text{*
 | 
|  |     82 | Now we have to prove the correctness of the compiler, i.e.\ that the
 | 
|  |     83 | execution of a compiled expression results in the value of the expression:
 | 
|  |     84 | *}
 | 
| 8771 |     85 | theorem "exec (comp e) s [] = [value e s]";
 | 
| 8744 |     86 | (*<*)oops;(*>*)
 | 
|  |     87 | text{*\noindent
 | 
|  |     88 | This theorem needs to be generalized to
 | 
|  |     89 | *}
 | 
|  |     90 | 
 | 
| 8771 |     91 | theorem "\\<forall>vs. exec (comp e) s vs = (value e s) # vs";
 | 
| 8744 |     92 | 
 | 
|  |     93 | txt{*\noindent
 | 
|  |     94 | which is proved by induction on \isa{e} followed by simplification, once
 | 
|  |     95 | we have the following lemma about executing the concatenation of two
 | 
|  |     96 | instruction sequences:
 | 
|  |     97 | *}
 | 
|  |     98 | (*<*)oops;(*>*)
 | 
|  |     99 | lemma exec_app[simp]:
 | 
| 8771 |    100 |   "\\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)"; 
 | 
| 8744 |    101 | 
 | 
|  |    102 | txt{*\noindent
 | 
|  |    103 | This requires induction on \isa{xs} and ordinary simplification for the
 | 
|  |    104 | base cases. In the induction step, simplification leaves us with a formula
 | 
|  |    105 | that contains two \isa{case}-expressions over instructions. Thus we add
 | 
|  |    106 | automatic case splitting as well, which finishes the proof:
 | 
|  |    107 | *}
 | 
| 9458 |    108 | by(induct_tac xs, simp, simp split: instr.split);
 | 
| 8744 |    109 | 
 | 
|  |    110 | text{*\noindent
 | 
|  |    111 | Note that because \isaindex{auto} performs simplification, it can
 | 
|  |    112 | also be modified in the same way \isa{simp} can. Thus the proof can be
 | 
|  |    113 | rewritten as
 | 
|  |    114 | *}
 | 
|  |    115 | (*<*)
 | 
|  |    116 | lemmas [simp del] = exec_app;
 | 
| 8771 |    117 | lemma [simp]: "\\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)"; 
 | 
| 8744 |    118 | (*>*)
 | 
| 9458 |    119 | by(induct_tac xs, auto split: instr.split);
 | 
| 8744 |    120 | 
 | 
|  |    121 | text{*\noindent
 | 
|  |    122 | Although this is more compact, it is less clear for the reader of the proof.
 | 
|  |    123 | 
 | 
| 8771 |    124 | We could now go back and prove \isa{exec (comp e) s [] = [value e s]}
 | 
| 8744 |    125 | merely by simplification with the generalized version we just proved.
 | 
|  |    126 | However, this is unnecessary because the generalized version fully subsumes
 | 
|  |    127 | its instance.
 | 
|  |    128 | *}
 | 
|  |    129 | (*<*)
 | 
| 8771 |    130 | theorem "\\<forall>vs. exec (comp e) s vs = (value e s) # vs";
 | 
| 9458 |    131 | by(induct_tac e, auto);
 | 
| 8744 |    132 | end
 | 
|  |    133 | (*>*)
 |