src/Doc/Tutorial/CodeGen/CodeGen.thy
author wenzelm
Fri, 12 Jan 2018 14:08:53 +0100
changeset 67406 23307fd33906
parent 58860 fee7cfa69c50
child 69505 cc2d676d5395
permissions -rw-r--r--
isabelle update_cartouches -c;
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
     1
(*<*)
16417
9bc16273c2d4 migrated theory headers to new format
haftmann
parents: 11458
diff changeset
     2
theory CodeGen imports Main begin
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
     3
(*>*)
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
     4
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
     5
section\<open>Case Study: Compiling Expressions\<close>
9844
8016321c7de1 *** empty log message ***
nipkow
parents: 9792
diff changeset
     6
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
     7
text\<open>\label{sec:ExprCompiler}
11458
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
     8
\index{compiling expressions example|(}%
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
     9
The task is to develop a compiler from a generic type of expressions (built
10795
9e888d60d3e5 minor edits to Chapters 1-3
paulson
parents: 10171
diff changeset
    10
from variables, constants and binary operations) to a stack machine.  This
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    11
generic type of expressions is a generalization of the boolean expressions in
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    12
\S\ref{sec:boolex}.  This time we do not commit ourselves to a particular
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    13
type of variables or values but make them type parameters.  Neither is there
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    14
a fixed set of binary operations: instead the expression contains the
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    15
appropriate function itself.
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    16
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    17
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
    18
type_synonym 'v binop = "'v \<Rightarrow> 'v \<Rightarrow> 'v"
58305
57752a91eec4 renamed 'datatype' to 'old_datatype'; 'datatype' is now alias for 'datatype_new'
blanchet
parents: 48985
diff changeset
    19
datatype (dead 'a, 'v) expr = Cex 'v
57752a91eec4 renamed 'datatype' to 'old_datatype'; 'datatype' is now alias for 'datatype_new'
blanchet
parents: 48985
diff changeset
    20
                      | Vex 'a
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
    21
                      | Bex "'v binop"  "('a,'v)expr"  "('a,'v)expr"
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    22
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    23
text\<open>\noindent
8771
026f37a86ea7 *** empty log message ***
nipkow
parents: 8744
diff changeset
    24
The three constructors represent constants, variables and the application of
026f37a86ea7 *** empty log message ***
nipkow
parents: 8744
diff changeset
    25
a binary operation to two subexpressions.
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    26
10795
9e888d60d3e5 minor edits to Chapters 1-3
paulson
parents: 10171
diff changeset
    27
The value of an expression with respect to an environment that maps variables to
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    28
values is easily defined:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    29
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    30
27015
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    31
primrec "value" :: "('a,'v)expr \<Rightarrow> ('a \<Rightarrow> 'v) \<Rightarrow> 'v" where
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    32
"value (Cex v) env = v" |
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    33
"value (Vex a) env = env a" |
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    34
"value (Bex f e1 e2) env = f (value e1 env) (value e2 env)"
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    35
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    36
text\<open>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    37
The stack machine has three instructions: load a constant value onto the
10795
9e888d60d3e5 minor edits to Chapters 1-3
paulson
parents: 10171
diff changeset
    38
stack, load the contents of an address onto the stack, and apply a
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    39
binary operation to the two topmost elements of the stack, replacing them by
9792
bbefb6ce5cb2 *** empty log message ***
nipkow
parents: 9458
diff changeset
    40
the result. As for @{text"expr"}, addresses and values are type parameters:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    41
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    42
58305
57752a91eec4 renamed 'datatype' to 'old_datatype'; 'datatype' is now alias for 'datatype_new'
blanchet
parents: 48985
diff changeset
    43
datatype (dead 'a, 'v) instr = Const 'v
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    44
                       | Load 'a
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
    45
                       | Apply "'v binop"
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    46
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    47
text\<open>
8771
026f37a86ea7 *** empty log message ***
nipkow
parents: 8744
diff changeset
    48
The execution of the stack machine is modelled by a function
9792
bbefb6ce5cb2 *** empty log message ***
nipkow
parents: 9458
diff changeset
    49
@{text"exec"} that takes a list of instructions, a store (modelled as a
8771
026f37a86ea7 *** empty log message ***
nipkow
parents: 8744
diff changeset
    50
function from addresses to values, just like the environment for
026f37a86ea7 *** empty log message ***
nipkow
parents: 8744
diff changeset
    51
evaluating expressions), and a stack (modelled as a list) of values,
10971
6852682eaf16 *** empty log message ***
nipkow
parents: 10885
diff changeset
    52
and returns the stack at the end of the execution --- the store remains
8771
026f37a86ea7 *** empty log message ***
nipkow
parents: 8744
diff changeset
    53
unchanged:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    54
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    55
27015
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    56
primrec exec :: "('a,'v)instr list \<Rightarrow> ('a\<Rightarrow>'v) \<Rightarrow> 'v list \<Rightarrow> 'v list"
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    57
where
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    58
"exec [] s vs = vs" |
8771
026f37a86ea7 *** empty log message ***
nipkow
parents: 8744
diff changeset
    59
"exec (i#is) s vs = (case i of
10171
59d6633835fa *** empty log message ***
nipkow
parents: 9933
diff changeset
    60
    Const v  \<Rightarrow> exec is s (v#vs)
59d6633835fa *** empty log message ***
nipkow
parents: 9933
diff changeset
    61
  | Load a   \<Rightarrow> exec is s ((s a)#vs)
27015
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    62
  | Apply f  \<Rightarrow> exec is s ((f (hd vs) (hd(tl vs)))#(tl(tl vs))))"
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    63
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    64
text\<open>\noindent
9792
bbefb6ce5cb2 *** empty log message ***
nipkow
parents: 9458
diff changeset
    65
Recall that @{term"hd"} and @{term"tl"}
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    66
return the first element and the remainder of a list.
11458
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
    67
Because all functions are total, \cdx{hd} is defined even for the empty
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    68
list, although we do not know what the result is. Thus our model of the
10795
9e888d60d3e5 minor edits to Chapters 1-3
paulson
parents: 10171
diff changeset
    69
machine always terminates properly, although the definition above does not
9792
bbefb6ce5cb2 *** empty log message ***
nipkow
parents: 9458
diff changeset
    70
tell us much about the result in situations where @{term"Apply"} was executed
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    71
with fewer than two elements on the stack.
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    72
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    73
The compiler is a function from expressions to a list of instructions. Its
10795
9e888d60d3e5 minor edits to Chapters 1-3
paulson
parents: 10171
diff changeset
    74
definition is obvious:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    75
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    76
27015
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    77
primrec compile :: "('a,'v)expr \<Rightarrow> ('a,'v)instr list" where
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    78
"compile (Cex v)       = [Const v]" |
f8537d69f514 *** empty log message ***
nipkow
parents: 17555
diff changeset
    79
"compile (Vex a)       = [Load a]" |
17212
6859484b5b2b comp -> compile
nipkow
parents: 16417
diff changeset
    80
"compile (Bex f e1 e2) = (compile e2) @ (compile e1) @ [Apply f]"
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    81
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    82
text\<open>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    83
Now we have to prove the correctness of the compiler, i.e.\ that the
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    84
execution of a compiled expression results in the value of the expression:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    85
\<close>
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
    86
theorem "exec (compile e) s [] = [value e s]"
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
    87
(*<*)oops(*>*)
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    88
text\<open>\noindent
11458
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
    89
This theorem needs to be generalized:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    90
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    91
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
    92
theorem "\<forall>vs. exec (compile e) s vs = (value e s) # vs"
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    93
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    94
txt\<open>\noindent
11458
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
    95
It will be proved by induction on @{term"e"} followed by simplification.  
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
    96
First, we must prove a lemma about executing the concatenation of two
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
    97
instruction sequences:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
    98
\<close>
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
    99
(*<*)oops(*>*)
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   100
lemma exec_app[simp]:
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
   101
  "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)" 
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   102
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
   103
txt\<open>\noindent
9792
bbefb6ce5cb2 *** empty log message ***
nipkow
parents: 9458
diff changeset
   104
This requires induction on @{term"xs"} and ordinary simplification for the
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   105
base cases. In the induction step, simplification leaves us with a formula
9792
bbefb6ce5cb2 *** empty log message ***
nipkow
parents: 9458
diff changeset
   106
that contains two @{text"case"}-expressions over instructions. Thus we add
11458
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
   107
automatic case splitting, which finishes the proof:
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
   108
\<close>
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
   109
apply(induct_tac xs, simp, simp split: instr.split)
10171
59d6633835fa *** empty log message ***
nipkow
parents: 9933
diff changeset
   110
(*<*)done(*>*)
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
   111
text\<open>\noindent
11428
332347b9b942 tidying the index
paulson
parents: 10971
diff changeset
   112
Note that because both \methdx{simp_all} and \methdx{auto} perform simplification, they can
332347b9b942 tidying the index
paulson
parents: 10971
diff changeset
   113
be modified in the same way as @{text simp}.  Thus the proof can be
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   114
rewritten as
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
   115
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   116
(*<*)
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
   117
declare exec_app[simp del]
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
   118
lemma [simp]: "\<forall>vs. exec (xs@ys) s vs = exec ys s (exec xs s vs)" 
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   119
(*>*)
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
   120
apply(induct_tac xs, simp_all split: instr.split)
10171
59d6633835fa *** empty log message ***
nipkow
parents: 9933
diff changeset
   121
(*<*)done(*>*)
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
   122
text\<open>\noindent
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   123
Although this is more compact, it is less clear for the reader of the proof.
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   124
17212
6859484b5b2b comp -> compile
nipkow
parents: 16417
diff changeset
   125
We could now go back and prove @{prop"exec (compile e) s [] = [value e s]"}
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   126
merely by simplification with the generalized version we just proved.
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   127
However, this is unnecessary because the generalized version fully subsumes
11458
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
   128
its instance.%
09a6c44a48ea numerous stylistic changes and indexing
paulson
parents: 11428
diff changeset
   129
\index{compiling expressions example|)}
67406
23307fd33906 isabelle update_cartouches -c;
wenzelm
parents: 58860
diff changeset
   130
\<close>
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   131
(*<*)
58860
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
   132
theorem "\<forall>vs. exec (compile e) s vs = (value e s) # vs"
fee7cfa69c50 eliminated spurious semicolons;
wenzelm
parents: 58305
diff changeset
   133
by(induct_tac e, auto)
8744
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   134
end
22fa8b16c3ae *** empty log message ***
nipkow
parents:
diff changeset
   135
(*>*)