1 (*<*) |
1 (*<*) |
2 theory CodeGen imports Main begin |
2 theory CodeGen imports Main begin |
3 (*>*) |
3 (*>*) |
4 |
4 |
5 section{*Case Study: Compiling Expressions*} |
5 section\<open>Case Study: Compiling Expressions\<close> |
6 |
6 |
7 text{*\label{sec:ExprCompiler} |
7 text\<open>\label{sec:ExprCompiler} |
8 \index{compiling expressions example|(}% |
8 \index{compiling expressions example|(}% |
9 The task is to develop a compiler from a generic type of expressions (built |
9 The task is to develop a compiler from a generic type of expressions (built |
10 from variables, constants and binary operations) to a stack machine. This |
10 from variables, constants and binary operations) to a stack machine. This |
11 generic type of expressions is a generalization of the boolean expressions in |
11 generic type of expressions is a generalization of the boolean expressions in |
12 \S\ref{sec:boolex}. This time we do not commit ourselves to a particular |
12 \S\ref{sec:boolex}. This time we do not commit ourselves to a particular |
13 type of variables or values but make them type parameters. Neither is there |
13 type of variables or values but make them type parameters. Neither is there |
14 a fixed set of binary operations: instead the expression contains the |
14 a fixed set of binary operations: instead the expression contains the |
15 appropriate function itself. |
15 appropriate function itself. |
16 *} |
16 \<close> |
17 |
17 |
18 type_synonym 'v binop = "'v \<Rightarrow> 'v \<Rightarrow> 'v" |
18 type_synonym 'v binop = "'v \<Rightarrow> 'v \<Rightarrow> 'v" |
19 datatype (dead 'a, 'v) expr = Cex 'v |
19 datatype (dead 'a, 'v) expr = Cex 'v |
20 | Vex 'a |
20 | Vex 'a |
21 | Bex "'v binop" "('a,'v)expr" "('a,'v)expr" |
21 | Bex "'v binop" "('a,'v)expr" "('a,'v)expr" |
22 |
22 |
23 text{*\noindent |
23 text\<open>\noindent |
24 The three constructors represent constants, variables and the application of |
24 The three constructors represent constants, variables and the application of |
25 a binary operation to two subexpressions. |
25 a binary operation to two subexpressions. |
26 |
26 |
27 The value of an expression with respect to an environment that maps variables to |
27 The value of an expression with respect to an environment that maps variables to |
28 values is easily defined: |
28 values is easily defined: |
29 *} |
29 \<close> |
30 |
30 |
31 primrec "value" :: "('a,'v)expr \<Rightarrow> ('a \<Rightarrow> 'v) \<Rightarrow> 'v" where |
31 primrec "value" :: "('a,'v)expr \<Rightarrow> ('a \<Rightarrow> 'v) \<Rightarrow> 'v" where |
32 "value (Cex v) env = v" | |
32 "value (Cex v) env = v" | |
33 "value (Vex a) env = env a" | |
33 "value (Vex a) env = env a" | |
34 "value (Bex f e1 e2) env = f (value e1 env) (value e2 env)" |
34 "value (Bex f e1 e2) env = f (value e1 env) (value e2 env)" |
35 |
35 |
36 text{* |
36 text\<open> |
37 The stack machine has three instructions: load a constant value onto the |
37 The stack machine has three instructions: load a constant value onto the |
38 stack, load the contents of an address onto the stack, and apply a |
38 stack, load the contents of an address onto the stack, and apply a |
39 binary operation to the two topmost elements of the stack, replacing them by |
39 binary operation to the two topmost elements of the stack, replacing them by |
40 the result. As for @{text"expr"}, addresses and values are type parameters: |
40 the result. As for @{text"expr"}, addresses and values are type parameters: |
41 *} |
41 \<close> |
42 |
42 |
43 datatype (dead 'a, 'v) instr = Const 'v |
43 datatype (dead 'a, 'v) instr = Const 'v |
44 | Load 'a |
44 | Load 'a |
45 | Apply "'v binop" |
45 | Apply "'v binop" |
46 |
46 |
47 text{* |
47 text\<open> |
48 The execution of the stack machine is modelled by a function |
48 The execution of the stack machine is modelled by a function |
49 @{text"exec"} that takes a list of instructions, a store (modelled as a |
49 @{text"exec"} that takes a list of instructions, a store (modelled as a |
50 function from addresses to values, just like the environment for |
50 function from addresses to values, just like the environment for |
51 evaluating expressions), and a stack (modelled as a list) of values, |
51 evaluating expressions), and a stack (modelled as a list) of values, |
52 and returns the stack at the end of the execution --- the store remains |
52 and returns the stack at the end of the execution --- the store remains |
53 unchanged: |
53 unchanged: |
54 *} |
54 \<close> |
55 |
55 |
56 primrec exec :: "('a,'v)instr list \<Rightarrow> ('a\<Rightarrow>'v) \<Rightarrow> 'v list \<Rightarrow> 'v list" |
56 primrec exec :: "('a,'v)instr list \<Rightarrow> ('a\<Rightarrow>'v) \<Rightarrow> 'v list \<Rightarrow> 'v list" |
57 where |
57 where |
58 "exec [] s vs = vs" | |
58 "exec [] s vs = vs" | |
59 "exec (i#is) s vs = (case i of |
59 "exec (i#is) s vs = (case i of |
60 Const v \<Rightarrow> exec is s (v#vs) |
60 Const v \<Rightarrow> exec is s (v#vs) |
61 | Load a \<Rightarrow> exec is s ((s a)#vs) |
61 | Load a \<Rightarrow> exec is s ((s a)#vs) |
62 | Apply f \<Rightarrow> exec is s ((f (hd vs) (hd(tl vs)))#(tl(tl vs))))" |
62 | Apply f \<Rightarrow> exec is s ((f (hd vs) (hd(tl vs)))#(tl(tl vs))))" |
63 |
63 |
64 text{*\noindent |
64 text\<open>\noindent |
65 Recall that @{term"hd"} and @{term"tl"} |
65 Recall that @{term"hd"} and @{term"tl"} |
66 return the first element and the remainder of a list. |
66 return the first element and the remainder of a list. |
67 Because all functions are total, \cdx{hd} is defined even for the empty |
67 Because all functions are total, \cdx{hd} is defined even for the empty |
68 list, although we do not know what the result is. Thus our model of the |
68 list, although we do not know what the result is. Thus our model of the |
69 machine always terminates properly, although the definition above does not |
69 machine always terminates properly, although the definition above does not |
70 tell us much about the result in situations where @{term"Apply"} was executed |
70 tell us much about the result in situations where @{term"Apply"} was executed |
71 with fewer than two elements on the stack. |
71 with fewer than two elements on the stack. |
72 |
72 |
73 The compiler is a function from expressions to a list of instructions. Its |
73 The compiler is a function from expressions to a list of instructions. Its |
74 definition is obvious: |
74 definition is obvious: |
75 *} |
75 \<close> |
76 |
76 |
77 primrec compile :: "('a,'v)expr \<Rightarrow> ('a,'v)instr list" where |
77 primrec compile :: "('a,'v)expr \<Rightarrow> ('a,'v)instr list" where |
78 "compile (Cex v) = [Const v]" | |
78 "compile (Cex v) = [Const v]" | |
79 "compile (Vex a) = [Load a]" | |
79 "compile (Vex a) = [Load a]" | |
80 "compile (Bex f e1 e2) = (compile e2) @ (compile e1) @ [Apply f]" |
80 "compile (Bex f e1 e2) = (compile e2) @ (compile e1) @ [Apply f]" |
81 |
81 |
82 text{* |
82 text\<open> |
83 Now we have to prove the correctness of the compiler, i.e.\ that the |
83 Now we have to prove the correctness of the compiler, i.e.\ that the |
84 execution of a compiled expression results in the value of the expression: |
84 execution of a compiled expression results in the value of the expression: |
85 *} |
85 \<close> |
86 theorem "exec (compile e) s [] = [value e s]" |
86 theorem "exec (compile e) s [] = [value e s]" |
87 (*<*)oops(*>*) |
87 (*<*)oops(*>*) |
88 text{*\noindent |
88 text\<open>\noindent |
89 This theorem needs to be generalized: |
89 This theorem needs to be generalized: |
90 *} |
90 \<close> |
91 |
91 |
92 theorem "\<forall>vs. exec (compile e) s vs = (value e s) # vs" |
92 theorem "\<forall>vs. exec (compile e) s vs = (value e s) # vs" |
93 |
93 |
94 txt{*\noindent |
94 txt\<open>\noindent |
95 It will be proved by induction on @{term"e"} followed by simplification. |
95 It will be proved by induction on @{term"e"} followed by simplification. |
96 First, we must prove a lemma about executing the concatenation of two |
96 First, we must prove a lemma about executing the concatenation of two |
97 instruction sequences: |
97 instruction sequences: |
98 *} |
98 \<close> |
99 (*<*)oops(*>*) |
99 (*<*)oops(*>*) |
100 lemma exec_app[simp]: |
100 lemma exec_app[simp]: |
101 "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)" |
101 "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)" |
102 |
102 |
103 txt{*\noindent |
103 txt\<open>\noindent |
104 This requires induction on @{term"xs"} and ordinary simplification for the |
104 This requires induction on @{term"xs"} and ordinary simplification for the |
105 base cases. In the induction step, simplification leaves us with a formula |
105 base cases. In the induction step, simplification leaves us with a formula |
106 that contains two @{text"case"}-expressions over instructions. Thus we add |
106 that contains two @{text"case"}-expressions over instructions. Thus we add |
107 automatic case splitting, which finishes the proof: |
107 automatic case splitting, which finishes the proof: |
108 *} |
108 \<close> |
109 apply(induct_tac xs, simp, simp split: instr.split) |
109 apply(induct_tac xs, simp, simp split: instr.split) |
110 (*<*)done(*>*) |
110 (*<*)done(*>*) |
111 text{*\noindent |
111 text\<open>\noindent |
112 Note that because both \methdx{simp_all} and \methdx{auto} perform simplification, they can |
112 Note that because both \methdx{simp_all} and \methdx{auto} perform simplification, they can |
113 be modified in the same way as @{text simp}. Thus the proof can be |
113 be modified in the same way as @{text simp}. Thus the proof can be |
114 rewritten as |
114 rewritten as |
115 *} |
115 \<close> |
116 (*<*) |
116 (*<*) |
117 declare exec_app[simp del] |
117 declare exec_app[simp del] |
118 lemma [simp]: "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)" |
118 lemma [simp]: "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)" |
119 (*>*) |
119 (*>*) |
120 apply(induct_tac xs, simp_all split: instr.split) |
120 apply(induct_tac xs, simp_all split: instr.split) |
121 (*<*)done(*>*) |
121 (*<*)done(*>*) |
122 text{*\noindent |
122 text\<open>\noindent |
123 Although this is more compact, it is less clear for the reader of the proof. |
123 Although this is more compact, it is less clear for the reader of the proof. |
124 |
124 |
125 We could now go back and prove @{prop"exec (compile e) s [] = [value e s]"} |
125 We could now go back and prove @{prop"exec (compile e) s [] = [value e s]"} |
126 merely by simplification with the generalized version we just proved. |
126 merely by simplification with the generalized version we just proved. |
127 However, this is unnecessary because the generalized version fully subsumes |
127 However, this is unnecessary because the generalized version fully subsumes |
128 its instance.% |
128 its instance.% |
129 \index{compiling expressions example|)} |
129 \index{compiling expressions example|)} |
130 *} |
130 \<close> |
131 (*<*) |
131 (*<*) |
132 theorem "\<forall>vs. exec (compile e) s vs = (value e s) # vs" |
132 theorem "\<forall>vs. exec (compile e) s vs = (value e s) # vs" |
133 by(induct_tac e, auto) |
133 by(induct_tac e, auto) |
134 end |
134 end |
135 (*>*) |
135 (*>*) |