author | haftmann |
Thu, 23 Sep 2010 15:46:17 +0200 | |
changeset 39664 | 0afaf89ab591 |
parent 38814 | 4d575fbfc920 |
child 39683 | f75a01ee6c41 |
permissions | -rw-r--r-- |
28213 | 1 |
theory Introduction |
2 |
imports Setup |
|
3 |
begin |
|
4 |
||
38405 | 5 |
section {* Introduction *} |
6 |
||
38437 | 7 |
text {* |
8 |
This tutorial introduces the code generator facilities of @{text |
|
9 |
"Isabelle/HOL"}. It allows to turn (a certain class of) HOL |
|
10 |
specifications into corresponding executable code in the programming |
|
38814 | 11 |
languages @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml}, |
12 |
@{text Haskell} \cite{haskell-revised-report} and @{text Scala} |
|
13 |
\cite{scala-overview-tech-report}. |
|
38405 | 14 |
|
38437 | 15 |
To profit from this tutorial, some familiarity and experience with |
16 |
@{theory HOL} \cite{isa-tutorial} and its basic theories is assumed. |
|
17 |
*} |
|
38405 | 18 |
|
19 |
||
38437 | 20 |
subsection {* Code generation principle: shallow embedding \label{sec:principle} *} |
28213 | 21 |
|
22 |
text {* |
|
38437 | 23 |
The key concept for understanding Isabelle's code generation is |
24 |
\emph{shallow embedding}: logical entities like constants, types and |
|
25 |
classes are identified with corresponding entities in the target |
|
26 |
language. In particular, the carrier of a generated program's |
|
27 |
semantics are \emph{equational theorems} from the logic. If we view |
|
28 |
a generated program as an implementation of a higher-order rewrite |
|
29 |
system, then every rewrite step performed by the program can be |
|
30 |
simulated in the logic, which guarantees partial correctness |
|
31 |
\cite{Haftmann-Nipkow:2010:code}. |
|
28213 | 32 |
*} |
33 |
||
38437 | 34 |
|
35 |
subsection {* A quick start with the Isabelle/HOL toolbox \label{sec:queue_example} *} |
|
28419 | 36 |
|
37 |
text {* |
|
38505 | 38 |
In a HOL theory, the @{command_def datatype} and @{command_def |
39 |
definition}/@{command_def primrec}/@{command_def fun} declarations |
|
40 |
form the core of a functional programming language. By default |
|
41 |
equational theorems stemming from those are used for generated code, |
|
42 |
therefore \qt{naive} code generation can proceed without further |
|
43 |
ado. |
|
28419 | 44 |
|
45 |
For example, here a simple \qt{implementation} of amortised queues: |
|
46 |
*} |
|
47 |
||
29798 | 48 |
datatype %quote 'a queue = AQueue "'a list" "'a list" |
28419 | 49 |
|
28564 | 50 |
definition %quote empty :: "'a queue" where |
29798 | 51 |
"empty = AQueue [] []" |
28419 | 52 |
|
28564 | 53 |
primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where |
29798 | 54 |
"enqueue x (AQueue xs ys) = AQueue (x # xs) ys" |
28419 | 55 |
|
28564 | 56 |
fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where |
29798 | 57 |
"dequeue (AQueue [] []) = (None, AQueue [] [])" |
58 |
| "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)" |
|
59 |
| "dequeue (AQueue xs []) = |
|
38437 | 60 |
(case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))" (*<*) |
61 |
||
62 |
lemma %invisible dequeue_nonempty_Nil [simp]: |
|
63 |
"xs \<noteq> [] \<Longrightarrow> dequeue (AQueue xs []) = (case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))" |
|
64 |
by (cases xs) (simp_all split: list.splits) (*>*) |
|
28419 | 65 |
|
66 |
text {* \noindent Then we can generate code e.g.~for @{text SML} as follows: *} |
|
28213 | 67 |
|
28564 | 68 |
export_code %quote empty dequeue enqueue in SML |
28447 | 69 |
module_name Example file "examples/example.ML" |
28419 | 70 |
|
71 |
text {* \noindent resulting in the following code: *} |
|
72 |
||
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
73 |
text %quote {* |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
74 |
\begin{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
75 |
@{code_stmts empty enqueue dequeue (SML)} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
76 |
\end{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
77 |
*} |
28419 | 78 |
|
79 |
text {* |
|
38505 | 80 |
\noindent The @{command_def export_code} command takes a |
81 |
space-separated list of constants for which code shall be generated; |
|
82 |
anything else needed for those is added implicitly. Then follows a |
|
83 |
target language identifier and a freely chosen module name. A file |
|
84 |
name denotes the destination to store the generated code. Note that |
|
85 |
the semantics of the destination depends on the target language: for |
|
38814 | 86 |
@{text SML}, @{text OCaml} and @{text Scala} it denotes a \emph{file}, |
87 |
for @{text Haskell} it denotes a \emph{directory} where a file named as the |
|
38437 | 88 |
module name (with extension @{text ".hs"}) is written: |
28419 | 89 |
*} |
90 |
||
28564 | 91 |
export_code %quote empty dequeue enqueue in Haskell |
28447 | 92 |
module_name Example file "examples/" |
28419 | 93 |
|
94 |
text {* |
|
38437 | 95 |
\noindent This is the corresponding code: |
28419 | 96 |
*} |
97 |
||
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
98 |
text %quote {* |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
99 |
\begin{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
100 |
@{code_stmts empty enqueue dequeue (Haskell)} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
101 |
\end{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
102 |
*} |
28419 | 103 |
|
104 |
text {* |
|
38437 | 105 |
\noindent For more details about @{command export_code} see |
106 |
\secref{sec:further}. |
|
28419 | 107 |
*} |
28213 | 108 |
|
38437 | 109 |
|
110 |
subsection {* Type classes *} |
|
38402 | 111 |
|
112 |
text {* |
|
38437 | 113 |
Code can also be generated from type classes in a Haskell-like |
114 |
manner. For illustration here an example from abstract algebra: |
|
38402 | 115 |
*} |
116 |
||
38437 | 117 |
class %quote semigroup = |
118 |
fixes mult :: "'a \<Rightarrow> 'a \<Rightarrow> 'a" (infixl "\<otimes>" 70) |
|
119 |
assumes assoc: "(x \<otimes> y) \<otimes> z = x \<otimes> (y \<otimes> z)" |
|
120 |
||
121 |
class %quote monoid = semigroup + |
|
122 |
fixes neutral :: 'a ("\<one>") |
|
123 |
assumes neutl: "\<one> \<otimes> x = x" |
|
124 |
and neutr: "x \<otimes> \<one> = x" |
|
125 |
||
126 |
instantiation %quote nat :: monoid |
|
127 |
begin |
|
128 |
||
129 |
primrec %quote mult_nat where |
|
130 |
"0 \<otimes> n = (0\<Colon>nat)" |
|
131 |
| "Suc m \<otimes> n = n + m \<otimes> n" |
|
132 |
||
133 |
definition %quote neutral_nat where |
|
134 |
"\<one> = Suc 0" |
|
135 |
||
136 |
lemma %quote add_mult_distrib: |
|
137 |
fixes n m q :: nat |
|
138 |
shows "(n + m) \<otimes> q = n \<otimes> q + m \<otimes> q" |
|
139 |
by (induct n) simp_all |
|
140 |
||
141 |
instance %quote proof |
|
142 |
fix m n q :: nat |
|
143 |
show "m \<otimes> n \<otimes> q = m \<otimes> (n \<otimes> q)" |
|
144 |
by (induct m) (simp_all add: add_mult_distrib) |
|
145 |
show "\<one> \<otimes> n = n" |
|
146 |
by (simp add: neutral_nat_def) |
|
147 |
show "m \<otimes> \<one> = m" |
|
148 |
by (induct m) (simp_all add: neutral_nat_def) |
|
149 |
qed |
|
150 |
||
151 |
end %quote |
|
28213 | 152 |
|
28419 | 153 |
text {* |
38437 | 154 |
\noindent We define the natural operation of the natural numbers |
155 |
on monoids: |
|
156 |
*} |
|
157 |
||
158 |
primrec %quote (in monoid) pow :: "nat \<Rightarrow> 'a \<Rightarrow> 'a" where |
|
159 |
"pow 0 a = \<one>" |
|
160 |
| "pow (Suc n) a = a \<otimes> pow n a" |
|
28419 | 161 |
|
38437 | 162 |
text {* |
163 |
\noindent This we use to define the discrete exponentiation |
|
164 |
function: |
|
165 |
*} |
|
166 |
||
167 |
definition %quote bexp :: "nat \<Rightarrow> nat" where |
|
168 |
"bexp n = pow n (Suc (Suc 0))" |
|
169 |
||
170 |
text {* |
|
171 |
\noindent The corresponding code in Haskell uses that language's |
|
172 |
native classes: |
|
173 |
*} |
|
174 |
||
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
175 |
text %quote {* |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
176 |
\begin{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
177 |
@{code_stmts bexp (Haskell)} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
178 |
\end{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
179 |
*} |
28419 | 180 |
|
38437 | 181 |
text {* |
182 |
\noindent This is a convenient place to show how explicit dictionary |
|
183 |
construction manifests in generated code -- the same example in |
|
184 |
@{text SML}: |
|
185 |
*} |
|
186 |
||
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
187 |
text %quote {* |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
188 |
\begin{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
189 |
@{code_stmts bexp (SML)} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
190 |
\end{typewriter} |
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
38814
diff
changeset
|
191 |
*} |
38437 | 192 |
|
193 |
text {* |
|
194 |
\noindent Note the parameters with trailing underscore (@{verbatim |
|
195 |
"A_"}), which are the dictionary parameters. |
|
196 |
*} |
|
197 |
||
198 |
||
199 |
subsection {* How to continue from here *} |
|
200 |
||
201 |
text {* |
|
202 |
What you have seen so far should be already enough in a lot of |
|
203 |
cases. If you are content with this, you can quit reading here. |
|
204 |
||
205 |
Anyway, to understand situations where problems occur or to increase |
|
206 |
the scope of code generation beyond default, it is necessary to gain |
|
207 |
some understanding how the code generator actually works: |
|
28419 | 208 |
|
209 |
\begin{itemize} |
|
210 |
||
38437 | 211 |
\item The foundations of the code generator are described in |
212 |
\secref{sec:foundations}. |
|
213 |
||
214 |
\item In particular \secref{sec:utterly_wrong} gives hints how to |
|
215 |
debug situations where code generation does not succeed as |
|
216 |
expected. |
|
28419 | 217 |
|
38437 | 218 |
\item The scope and quality of generated code can be increased |
219 |
dramatically by applying refinement techniques, which are |
|
220 |
introduced in \secref{sec:refinement}. |
|
28419 | 221 |
|
38437 | 222 |
\item Inductive predicates can be turned executable using an |
223 |
extension of the code generator \secref{sec:inductive}. |
|
224 |
||
225 |
\item You may want to skim over the more technical sections |
|
226 |
\secref{sec:adaptation} and \secref{sec:further}. |
|
227 |
||
228 |
\item For exhaustive syntax diagrams etc. you should visit the |
|
229 |
Isabelle/Isar Reference Manual \cite{isabelle-isar-ref}. |
|
28419 | 230 |
|
231 |
\end{itemize} |
|
232 |
||
38437 | 233 |
\bigskip |
234 |
||
235 |
\begin{center}\fbox{\fbox{\begin{minipage}{8cm} |
|
236 |
||
237 |
\begin{center}\textit{Happy proving, happy hacking!}\end{center} |
|
238 |
||
239 |
\end{minipage}}}\end{center} |
|
240 |
||
241 |
\begin{warn} |
|
242 |
There is also a more ancient code generator in Isabelle by Stefan |
|
243 |
Berghofer \cite{Berghofer-Nipkow:2002}. Although its |
|
244 |
functionality is covered by the code generator presented here, it |
|
245 |
will sometimes show up as an artifact. In case of ambiguity, we |
|
246 |
will refer to the framework described here as @{text "generic code |
|
247 |
generator"}, to the other as @{text "SML code generator"}. |
|
248 |
\end{warn} |
|
28419 | 249 |
*} |
28213 | 250 |
|
251 |
end |