author | wenzelm |
Tue, 15 Apr 2008 18:49:16 +0200 | |
changeset 26663 | 020618551468 |
parent 26513 | 6f306c8c2c54 |
child 26732 | 6ea9de67e576 |
permissions | -rw-r--r-- |
26513 | 1 |
|
20948 | 2 |
(* $Id$ *) |
3 |
||
21147 | 4 |
(*<*) |
20948 | 5 |
theory Codegen |
6 |
imports Main |
|
21452 | 7 |
uses "../../../antiquote_setup.ML" |
20948 | 8 |
begin |
9 |
||
21147 | 10 |
ML {* |
24217 | 11 |
CodeTarget.code_width := 74; |
21147 | 12 |
*} |
13 |
||
14 |
(*>*) |
|
15 |
||
20948 | 16 |
chapter {* Code generation from Isabelle theories *} |
17 |
||
18 |
section {* Introduction *} |
|
19 |
||
21058 | 20 |
subsection {* Motivation *} |
21 |
||
20948 | 22 |
text {* |
21058 | 23 |
Executing formal specifications as programs is a well-established |
24 |
topic in the theorem proving community. With increasing |
|
25 |
application of theorem proving systems in the area of |
|
26 |
software development and verification, its relevance manifests |
|
27 |
for running test cases and rapid prototyping. In logical |
|
28 |
calculi like constructive type theory, |
|
29 |
a notion of executability is implicit due to the nature |
|
22550 | 30 |
of the calculus. In contrast, specifications in Isabelle |
21058 | 31 |
can be highly non-executable. In order to bridge |
32 |
the gap between logic and executable specifications, |
|
33 |
an explicit non-trivial transformation has to be applied: |
|
34 |
code generation. |
|
35 |
||
36 |
This tutorial introduces a generic code generator for the |
|
37 |
Isabelle system \cite{isa-tutorial}. |
|
38 |
Generic in the sense that the |
|
39 |
\qn{target language} for which code shall ultimately be |
|
40 |
generated is not fixed but may be an arbitrary state-of-the-art |
|
41 |
functional programming language (currently, the implementation |
|
22550 | 42 |
supports SML \cite{SML}, OCaml \cite{OCaml} and Haskell |
43 |
\cite{haskell-revised-report}). |
|
21058 | 44 |
We aim to provide a |
45 |
versatile environment |
|
46 |
suitable for software development and verification, |
|
47 |
structuring the process |
|
48 |
of code generation into a small set of orthogonal principles |
|
49 |
while achieving a big coverage of application areas |
|
50 |
with maximum flexibility. |
|
21189 | 51 |
|
22550 | 52 |
Conceptually the code generator framework is part |
53 |
of Isabelle's @{text Pure} meta logic; the object logic |
|
54 |
@{text HOL} which is an extension of @{text Pure} |
|
55 |
already comes with a reasonable framework setup and thus provides |
|
56 |
a good working horse for raising code-generation-driven |
|
57 |
applications. So, we assume some familiarity and experience |
|
58 |
with the ingredients of the @{text HOL} \emph{Main} theory |
|
59 |
(see also \cite{isa-tutorial}). |
|
21058 | 60 |
*} |
61 |
||
62 |
||
63 |
subsection {* Overview *} |
|
64 |
||
65 |
text {* |
|
66 |
The code generator aims to be usable with no further ado |
|
67 |
in most cases while allowing for detailed customization. |
|
22550 | 68 |
This manifests in the structure of this tutorial: |
69 |
we start with a generic example \secref{sec:example} |
|
70 |
and introduce code generation concepts \secref{sec:concept}. |
|
71 |
Section |
|
21178 | 72 |
\secref{sec:basics} explains how to use the framework naively, |
21058 | 73 |
presuming a reasonable default setup. Then, section |
74 |
\secref{sec:advanced} deals with advanced topics, |
|
75 |
introducing further aspects of the code generator framework |
|
76 |
in a motivation-driven manner. Last, section \secref{sec:ml} |
|
77 |
introduces the framework's internal programming interfaces. |
|
20948 | 78 |
|
21058 | 79 |
\begin{warn} |
80 |
Ultimately, the code generator which this tutorial deals with |
|
81 |
is supposed to replace the already established code generator |
|
82 |
by Stefan Berghofer \cite{Berghofer-Nipkow:2002}. |
|
21147 | 83 |
So, for the moment, there are two distinct code generators |
21058 | 84 |
in Isabelle. |
22916 | 85 |
Also note that while the framework itself is |
22550 | 86 |
object-logic independent, only @{text HOL} provides a reasonable |
21058 | 87 |
framework setup. |
88 |
\end{warn} |
|
89 |
*} |
|
90 |
||
91 |
||
22550 | 92 |
section {* An example: a simple theory of search trees \label{sec:example} *} |
93 |
||
94 |
text {* |
|
22916 | 95 |
When writing executable specifications using @{text HOL}, |
96 |
it is convenient to use |
|
22550 | 97 |
three existing packages: the datatype package for defining |
98 |
datatypes, the function package for (recursive) functions, |
|
99 |
and the class package for overloaded definitions. |
|
100 |
||
101 |
We develope a small theory of search trees; trees are represented |
|
102 |
as a datatype with key type @{typ "'a"} and value type @{typ "'b"}: |
|
103 |
*} |
|
22479 | 104 |
|
105 |
datatype ('a, 'b) searchtree = Leaf "'a\<Colon>linorder" 'b |
|
106 |
| Branch "('a, 'b) searchtree" "'a" "('a, 'b) searchtree" |
|
107 |
||
22550 | 108 |
text {* |
109 |
\noindent Note that we have constrained the type of keys |
|
110 |
to the class of total orders, @{text linorder}. |
|
111 |
||
112 |
We define @{text find} and @{text update} functions: |
|
113 |
*} |
|
114 |
||
25870 | 115 |
primrec |
22479 | 116 |
find :: "('a\<Colon>linorder, 'b) searchtree \<Rightarrow> 'a \<Rightarrow> 'b option" where |
117 |
"find (Leaf key val) it = (if it = key then Some val else None)" |
|
118 |
| "find (Branch t1 key t2) it = (if it \<le> key then find t1 it else find t2 it)" |
|
119 |
||
120 |
fun |
|
121 |
update :: "'a\<Colon>linorder \<times> 'b \<Rightarrow> ('a, 'b) searchtree \<Rightarrow> ('a, 'b) searchtree" where |
|
122 |
"update (it, entry) (Leaf key val) = ( |
|
123 |
if it = key then Leaf key entry |
|
124 |
else if it \<le> key |
|
125 |
then Branch (Leaf it entry) it (Leaf key val) |
|
126 |
else Branch (Leaf key val) it (Leaf it entry) |
|
127 |
)" |
|
128 |
| "update (it, entry) (Branch t1 key t2) = ( |
|
129 |
if it \<le> key |
|
130 |
then (Branch (update (it, entry) t1) key t2) |
|
131 |
else (Branch t1 key (update (it, entry) t2)) |
|
132 |
)" |
|
133 |
||
22550 | 134 |
text {* |
135 |
\noindent For testing purpose, we define a small example |
|
136 |
using natural numbers @{typ nat} (which are a @{text linorder}) |
|
23130 | 137 |
as keys and list of nats as values: |
22550 | 138 |
*} |
139 |
||
23130 | 140 |
definition |
141 |
example :: "(nat, nat list) searchtree" |
|
142 |
where |
|
143 |
"example = update (Suc (Suc (Suc (Suc 0))), [Suc (Suc 0), Suc (Suc 0)]) (update (Suc (Suc (Suc 0)), [Suc (Suc (Suc 0))]) |
|
144 |
(update (Suc (Suc 0), [Suc (Suc 0)]) (Leaf (Suc 0) [])))" |
|
22479 | 145 |
|
22550 | 146 |
text {* |
147 |
\noindent Then we generate code |
|
148 |
*} |
|
149 |
||
24348 | 150 |
export_code example (*<*)in SML(*>*)in SML file "examples/tree.ML" |
22479 | 151 |
|
152 |
text {* |
|
22550 | 153 |
\noindent which looks like: |
22479 | 154 |
\lstsml{Thy/examples/tree.ML} |
155 |
*} |
|
156 |
||
22550 | 157 |
|
158 |
section {* Code generation concepts and process \label{sec:concept} *} |
|
21058 | 159 |
|
160 |
text {* |
|
21189 | 161 |
\begin{figure}[h] |
162 |
\centering |
|
163 |
\includegraphics[width=0.7\textwidth]{codegen_process} |
|
164 |
\caption{code generator -- processing overview} |
|
165 |
\label{fig:process} |
|
166 |
\end{figure} |
|
167 |
||
21058 | 168 |
The code generator employs a notion of executability |
169 |
for three foundational executable ingredients known |
|
170 |
from functional programming: |
|
22060 | 171 |
\emph{defining equations}, \emph{datatypes}, and |
172 |
\emph{type classes}. A defining equation as a first approximation |
|
21058 | 173 |
is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"} |
174 |
(an equation headed by a constant @{text f} with arguments |
|
22798 | 175 |
@{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}). |
22060 | 176 |
Code generation aims to turn defining equations |
21058 | 177 |
into a functional program by running through |
178 |
a process (see figure \ref{fig:process}): |
|
179 |
||
180 |
\begin{itemize} |
|
181 |
||
182 |
\item Out of the vast collection of theorems proven in a |
|
183 |
\qn{theory}, a reasonable subset modeling |
|
22060 | 184 |
defining equations is \qn{selected}. |
21058 | 185 |
|
186 |
\item On those selected theorems, certain |
|
187 |
transformations are carried out |
|
188 |
(\qn{preprocessing}). Their purpose is to turn theorems |
|
189 |
representing non- or badly executable |
|
190 |
specifications into equivalent but executable counterparts. |
|
191 |
The result is a structured collection of \qn{code theorems}. |
|
192 |
||
22479 | 193 |
\item These \qn{code theorems} then are \qn{translated} |
21058 | 194 |
into an Haskell-like intermediate |
195 |
language. |
|
196 |
||
197 |
\item Finally, out of the intermediate language the final |
|
198 |
code in the desired \qn{target language} is \qn{serialized}. |
|
199 |
||
200 |
\end{itemize} |
|
201 |
||
202 |
From these steps, only the two last are carried out |
|
203 |
outside the logic; by keeping this layer as |
|
204 |
thin as possible, the amount of code to trust is |
|
205 |
kept to a minimum. |
|
206 |
*} |
|
207 |
||
208 |
||
209 |
||
210 |
section {* Basics \label{sec:basics} *} |
|
20948 | 211 |
|
212 |
subsection {* Invoking the code generator *} |
|
213 |
||
21058 | 214 |
text {* |
22916 | 215 |
Thanks to a reasonable setup of the @{text HOL} theories, in |
21058 | 216 |
most cases code generation proceeds without further ado: |
217 |
*} |
|
218 |
||
25870 | 219 |
primrec |
22473 | 220 |
fac :: "nat \<Rightarrow> nat" where |
221 |
"fac 0 = 1" |
|
222 |
| "fac (Suc n) = Suc n * fac n" |
|
21058 | 223 |
|
224 |
text {* |
|
22550 | 225 |
\noindent This executable specification is now turned to SML code: |
21058 | 226 |
*} |
227 |
||
24348 | 228 |
export_code fac (*<*)in SML(*>*)in SML file "examples/fac.ML" |
21058 | 229 |
|
230 |
text {* |
|
24348 | 231 |
\noindent The @{text "\<EXPORTCODE>"} command takes a space-separated list of |
21058 | 232 |
constants together with \qn{serialization directives} |
22845 | 233 |
These start with a \qn{target language} |
234 |
identifier, followed by a file specification |
|
235 |
where to write the generated code to. |
|
21058 | 236 |
|
22060 | 237 |
Internally, the defining equations for all selected |
21178 | 238 |
constants are taken, including any transitively required |
21058 | 239 |
constants, datatypes and classes, resulting in the following |
240 |
code: |
|
241 |
||
242 |
\lstsml{Thy/examples/fac.ML} |
|
243 |
||
244 |
The code generator will complain when a required |
|
22550 | 245 |
ingredient does not provide a executable counterpart, |
246 |
e.g.~generating code |
|
21058 | 247 |
for constants not yielding |
22550 | 248 |
a defining equation (e.g.~the Hilbert choice |
249 |
operation @{text "SOME"}): |
|
21058 | 250 |
*} |
251 |
(*<*) |
|
252 |
setup {* Sign.add_path "foo" *} |
|
253 |
(*>*) |
|
254 |
definition |
|
21993 | 255 |
pick_some :: "'a list \<Rightarrow> 'a" where |
21058 | 256 |
"pick_some xs = (SOME x. x \<in> set xs)" |
257 |
(*<*) |
|
258 |
hide const pick_some |
|
259 |
||
260 |
setup {* Sign.parent_path *} |
|
261 |
||
262 |
definition |
|
21993 | 263 |
pick_some :: "'a list \<Rightarrow> 'a" where |
21058 | 264 |
"pick_some = hd" |
265 |
(*>*) |
|
25870 | 266 |
|
24348 | 267 |
export_code pick_some in SML file "examples/fail_const.ML" |
21058 | 268 |
|
22550 | 269 |
text {* \noindent will fail. *} |
270 |
||
20948 | 271 |
subsection {* Theorem selection *} |
272 |
||
21058 | 273 |
text {* |
22060 | 274 |
The list of all defining equations in a theory may be inspected |
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
275 |
using the @{text "\<PRINTCODESETUP>"} command: |
21058 | 276 |
*} |
277 |
||
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
278 |
print_codesetup |
21058 | 279 |
|
280 |
text {* |
|
281 |
\noindent which displays a table of constant with corresponding |
|
22060 | 282 |
defining equations (the additional stuff displayed |
22751 | 283 |
shall not bother us for the moment). |
21058 | 284 |
|
22916 | 285 |
The typical @{text HOL} tools are already set up in a way that |
22751 | 286 |
function definitions introduced by @{text "\<DEFINITION>"}, |
25870 | 287 |
@{text "\<PRIMREC>"}, @{text "\<FUN>"}, |
288 |
@{text "\<FUNCTION>"}, @{text "\<CONSTDEFS>"}, |
|
21323 | 289 |
@{text "\<RECDEF>"} are implicitly propagated |
22060 | 290 |
to this defining equation table. Specific theorems may be |
21058 | 291 |
selected using an attribute: \emph{code func}. As example, |
292 |
a weight selector function: |
|
293 |
*} |
|
294 |
||
295 |
primrec |
|
25870 | 296 |
pick :: "(nat \<times> 'a) list \<Rightarrow> nat \<Rightarrow> 'a" where |
21058 | 297 |
"pick (x#xs) n = (let (k, v) = x in |
298 |
if n < k then v else pick xs (n - k))" |
|
299 |
||
300 |
text {* |
|
22798 | 301 |
\noindent We want to eliminate the explicit destruction |
21058 | 302 |
of @{term x} to @{term "(k, v)"}: |
303 |
*} |
|
304 |
||
305 |
lemma [code func]: |
|
306 |
"pick ((k, v)#xs) n = (if n < k then v else pick xs (n - k))" |
|
307 |
by simp |
|
308 |
||
24348 | 309 |
export_code pick (*<*)in SML(*>*) in SML file "examples/pick1.ML" |
21058 | 310 |
|
311 |
text {* |
|
22798 | 312 |
\noindent This theorem now is used for generating code: |
21058 | 313 |
|
314 |
\lstsml{Thy/examples/pick1.ML} |
|
315 |
||
22798 | 316 |
\noindent It might be convenient to remove the pointless original |
22845 | 317 |
equation, using the \emph{func del} attribute: |
21058 | 318 |
*} |
319 |
||
22845 | 320 |
lemmas [code func del] = pick.simps |
21058 | 321 |
|
24348 | 322 |
export_code pick (*<*)in SML(*>*) in SML file "examples/pick2.ML" |
21058 | 323 |
|
324 |
text {* |
|
325 |
\lstsml{Thy/examples/pick2.ML} |
|
326 |
||
22798 | 327 |
\noindent Syntactic redundancies are implicitly dropped. For example, |
21058 | 328 |
using a modified version of the @{const fac} function |
22060 | 329 |
as defining equation, the then redundant (since |
330 |
syntactically subsumed) original defining equations |
|
21058 | 331 |
are dropped, resulting in a warning: |
332 |
*} |
|
333 |
||
334 |
lemma [code func]: |
|
335 |
"fac n = (case n of 0 \<Rightarrow> 1 | Suc m \<Rightarrow> n * fac m)" |
|
336 |
by (cases n) simp_all |
|
337 |
||
24348 | 338 |
export_code fac (*<*)in SML(*>*)in SML file "examples/fac_case.ML" |
21058 | 339 |
|
340 |
text {* |
|
341 |
\lstsml{Thy/examples/fac_case.ML} |
|
342 |
||
343 |
\begin{warn} |
|
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
344 |
The attributes \emph{code} and \emph{code del} |
21058 | 345 |
associated with the existing code generator also apply to |
346 |
the new one: \emph{code} implies \emph{code func}, |
|
22845 | 347 |
and \emph{code del} implies \emph{code func del}. |
21058 | 348 |
\end{warn} |
349 |
*} |
|
20948 | 350 |
|
351 |
subsection {* Type classes *} |
|
352 |
||
21058 | 353 |
text {* |
354 |
Type classes enter the game via the Isar class package. |
|
21075 | 355 |
For a short introduction how to use it, see \cite{isabelle-classes}; |
21147 | 356 |
here we just illustrate its impact on code generation. |
21058 | 357 |
|
358 |
In a target language, type classes may be represented |
|
22798 | 359 |
natively (as in the case of Haskell). For languages |
21058 | 360 |
like SML, they are implemented using \emph{dictionaries}. |
21178 | 361 |
Our following example specifies a class \qt{null}, |
21058 | 362 |
assigning to each of its inhabitants a \qt{null} value: |
363 |
*} |
|
364 |
||
22473 | 365 |
class null = type + |
21058 | 366 |
fixes null :: 'a |
367 |
||
25870 | 368 |
primrec |
369 |
head :: "'a\<Colon>null list \<Rightarrow> 'a" where |
|
21058 | 370 |
"head [] = null" |
22798 | 371 |
| "head (x#xs) = x" |
21058 | 372 |
|
373 |
text {* |
|
25533 | 374 |
\noindent We provide some instances for our @{text null}: |
21058 | 375 |
*} |
376 |
||
25533 | 377 |
instantiation option and list :: (type) null |
378 |
begin |
|
379 |
||
380 |
definition |
|
381 |
"null = None" |
|
21058 | 382 |
|
25533 | 383 |
definition |
384 |
"null = []" |
|
385 |
||
386 |
instance .. |
|
387 |
||
388 |
end |
|
21058 | 389 |
|
390 |
text {* |
|
25533 | 391 |
\noindent Constructing a dummy example: |
21058 | 392 |
*} |
393 |
||
394 |
definition |
|
395 |
"dummy = head [Some (Suc 0), None]" |
|
396 |
||
397 |
text {* |
|
21178 | 398 |
Type classes offer a suitable occasion to introduce |
21058 | 399 |
the Haskell serializer. Its usage is almost the same |
21075 | 400 |
as SML, but, in accordance with conventions |
401 |
some Haskell systems enforce, each module ends |
|
402 |
up in a single file. The module hierarchy is reflected in |
|
22845 | 403 |
the file system, with root directory given as file specification. |
21058 | 404 |
*} |
405 |
||
24348 | 406 |
export_code dummy in Haskell file "examples/" |
21147 | 407 |
(* NOTE: you may use Haskell only once in this document, otherwise |
408 |
you have to work in distinct subdirectories *) |
|
21058 | 409 |
|
410 |
text {* |
|
411 |
\lsthaskell{Thy/examples/Codegen.hs} |
|
22798 | 412 |
\noindent (we have left out all other modules). |
21058 | 413 |
|
22798 | 414 |
\medskip |
21058 | 415 |
|
416 |
The whole code in SML with explicit dictionary passing: |
|
417 |
*} |
|
418 |
||
24348 | 419 |
export_code dummy (*<*)in SML(*>*)in SML file "examples/class.ML" |
21058 | 420 |
|
421 |
text {* |
|
422 |
\lstsml{Thy/examples/class.ML} |
|
423 |
||
22798 | 424 |
\medskip |
425 |
||
426 |
\noindent or in OCaml: |
|
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
427 |
*} |
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
428 |
|
24348 | 429 |
export_code dummy in OCaml file "examples/class.ocaml" |
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
430 |
|
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
431 |
text {* |
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
432 |
\lstsml{Thy/examples/class.ocaml} |
22798 | 433 |
|
434 |
\medskip The explicit association of constants |
|
435 |
to classes can be inspected using the @{text "\<PRINTCLASSES>"} |
|
22845 | 436 |
command. |
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
437 |
*} |
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
438 |
|
22550 | 439 |
|
21058 | 440 |
section {* Recipes and advanced topics \label{sec:advanced} *} |
441 |
||
21089 | 442 |
text {* |
443 |
In this tutorial, we do not attempt to give an exhaustive |
|
444 |
description of the code generator framework; instead, |
|
445 |
we cast a light on advanced topics by introducing |
|
446 |
them together with practically motivated examples. Concerning |
|
447 |
further reading, see |
|
448 |
||
449 |
\begin{itemize} |
|
450 |
||
451 |
\item the Isabelle/Isar Reference Manual \cite{isabelle-isar-ref} |
|
452 |
for exhaustive syntax diagrams. |
|
24193 | 453 |
\item or \cite{Haftmann-Nipkow:2007:codegen} which deals with foundational issues |
21089 | 454 |
of the code generator framework. |
455 |
||
456 |
\end{itemize} |
|
457 |
*} |
|
21058 | 458 |
|
22798 | 459 |
subsection {* Library theories \label{sec:library} *} |
21058 | 460 |
|
21089 | 461 |
text {* |
22916 | 462 |
The @{text HOL} @{text Main} theory already provides a code |
463 |
generator setup |
|
21089 | 464 |
which should be suitable for most applications. Common extensions |
22916 | 465 |
and modifications are available by certain theories of the @{text HOL} |
21089 | 466 |
library; beside being useful in applications, they may serve |
21178 | 467 |
as a tutorial for customizing the code generator setup. |
21089 | 468 |
|
469 |
\begin{description} |
|
470 |
||
24994
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
471 |
\item[@{text "Code_Integer"}] represents @{text HOL} integers by big |
22798 | 472 |
integer literals in target languages. |
24994
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
473 |
\item[@{text "Code_Char"}] represents @{text HOL} characters by |
22798 | 474 |
character literals in target languages. |
24994
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
475 |
\item[@{text "Code_Char_chr"}] like @{text "Code_Char"}, |
22798 | 476 |
but also offers treatment of character codes; includes |
24994
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
477 |
@{text "Code_Integer"}. |
23850 | 478 |
\item[@{text "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers, |
21178 | 479 |
which in general will result in higher efficency; pattern |
25369
5200374fda5d
replaced @{const} (allows name only) by proper @{term};
wenzelm
parents:
24994
diff
changeset
|
480 |
matching with @{term "0\<Colon>nat"} / @{const "Suc"} |
24994
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
481 |
is eliminated; includes @{text "Code_Integer"}. |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
482 |
\item[@{text "Code_Index"}] provides an additional datatype |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
483 |
@{text index} which is mapped to target-language built-in integers. |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
484 |
Useful for code setups which involve e.g. indexing of |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
485 |
target-language arrays. |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
486 |
\item[@{text "Code_Message"}] provides an additional datatype |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
487 |
@{text message_string} which is isomorphic to strings; |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
488 |
@{text message_string}s are mapped to target-language strings. |
c385c4eabb3b
consolidated naming conventions for code generator theories
haftmann
parents:
24628
diff
changeset
|
489 |
Useful for code setups which involve e.g. printing (error) messages. |
21089 | 490 |
|
491 |
\end{description} |
|
22916 | 492 |
|
493 |
\begin{warn} |
|
494 |
When importing any of these theories, they should form the last |
|
495 |
items in an import list. Since these theories adapt the |
|
496 |
code generator setup in a non-conservative fashion, |
|
497 |
strange effects may occur otherwise. |
|
498 |
\end{warn} |
|
21089 | 499 |
*} |
500 |
||
20948 | 501 |
subsection {* Preprocessing *} |
502 |
||
21089 | 503 |
text {* |
21147 | 504 |
Before selected function theorems are turned into abstract |
505 |
code, a chain of definitional transformation steps is carried |
|
21178 | 506 |
out: \emph{preprocessing}. There are three possibilities |
21147 | 507 |
to customize preprocessing: \emph{inline theorems}, |
508 |
\emph{inline procedures} and \emph{generic preprocessors}. |
|
509 |
||
510 |
\emph{Inline theorems} are rewriting rules applied to each |
|
22060 | 511 |
defining equation. Due to the interpretation of theorems |
512 |
of defining equations, rewrites are applied to the right |
|
21147 | 513 |
hand side and the arguments of the left hand side of an |
514 |
equation, but never to the constant heading the left hand side. |
|
515 |
Inline theorems may be declared an undeclared using the |
|
22845 | 516 |
\emph{code inline} or \emph{code inline del} attribute respectively. |
21147 | 517 |
Some common applications: |
518 |
*} |
|
519 |
||
520 |
text_raw {* |
|
521 |
\begin{itemize} |
|
522 |
*} |
|
523 |
||
22845 | 524 |
text {* |
525 |
\item replacing non-executable constructs by executable ones: |
|
526 |
*} |
|
21147 | 527 |
|
22845 | 528 |
lemma [code inline]: |
529 |
"x \<in> set xs \<longleftrightarrow> x mem xs" by (induct xs) simp_all |
|
530 |
||
531 |
text {* |
|
532 |
\item eliminating superfluous constants: |
|
21089 | 533 |
*} |
534 |
||
22845 | 535 |
lemma [code inline]: |
536 |
"1 = Suc 0" by simp |
|
537 |
||
538 |
text {* |
|
539 |
\item replacing executable but inconvenient constructs: |
|
540 |
*} |
|
541 |
||
542 |
lemma [code inline]: |
|
543 |
"xs = [] \<longleftrightarrow> List.null xs" by (induct xs) simp_all |
|
21147 | 544 |
|
545 |
text_raw {* |
|
546 |
\end{itemize} |
|
547 |
*} |
|
548 |
||
549 |
text {* |
|
22845 | 550 |
\noindent The current set of inline theorems may be inspected using |
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
551 |
the @{text "\<PRINTCODESETUP>"} command. |
21147 | 552 |
|
553 |
\emph{Inline procedures} are a generalized version of inline |
|
554 |
theorems written in ML -- rewrite rules are generated dependent |
|
555 |
on the function theorems for a certain function. One |
|
556 |
application is the implicit expanding of @{typ nat} numerals |
|
25369
5200374fda5d
replaced @{const} (allows name only) by proper @{term};
wenzelm
parents:
24994
diff
changeset
|
557 |
to @{term "0\<Colon>nat"} / @{const Suc} representation. See further |
21147 | 558 |
\secref{sec:ml} |
559 |
||
560 |
\emph{Generic preprocessors} provide a most general interface, |
|
561 |
transforming a list of function theorems to another |
|
562 |
list of function theorems, provided that neither the heading |
|
25369
5200374fda5d
replaced @{const} (allows name only) by proper @{term};
wenzelm
parents:
24994
diff
changeset
|
563 |
constant nor its type change. The @{term "0\<Colon>nat"} / @{const Suc} |
21323 | 564 |
pattern elimination implemented in |
21452 | 565 |
theory @{text "EfficientNat"} (\secref{eff_nat}) uses this |
21147 | 566 |
interface. |
567 |
||
568 |
\begin{warn} |
|
569 |
The order in which single preprocessing steps are carried |
|
570 |
out currently is not specified; in particular, preprocessing |
|
21178 | 571 |
is \emph{no} fix point process. Keep this in mind when |
21147 | 572 |
setting up the preprocessor. |
573 |
||
574 |
Further, the attribute \emph{code unfold} |
|
575 |
associated with the existing code generator also applies to |
|
576 |
the new one: \emph{code unfold} implies \emph{code inline}. |
|
577 |
\end{warn} |
|
578 |
*} |
|
20948 | 579 |
|
22798 | 580 |
|
581 |
subsection {* Concerning operational equality *} |
|
582 |
||
583 |
text {* |
|
584 |
Surely you have already noticed how equality is treated |
|
585 |
by the code generator: |
|
586 |
*} |
|
587 |
||
25870 | 588 |
primrec |
22798 | 589 |
collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where |
590 |
"collect_duplicates xs ys [] = xs" |
|
591 |
| "collect_duplicates xs ys (z#zs) = (if z \<in> set xs |
|
592 |
then if z \<in> set ys |
|
593 |
then collect_duplicates xs ys zs |
|
594 |
else collect_duplicates xs (z#ys) zs |
|
595 |
else collect_duplicates (z#xs) (z#ys) zs)" |
|
596 |
||
597 |
text {* |
|
598 |
The membership test during preprocessing is rewritten, |
|
23850 | 599 |
resulting in @{const List.member}, which itself |
22798 | 600 |
performs an explicit equality check. |
601 |
*} |
|
602 |
||
24348 | 603 |
export_code collect_duplicates (*<*)in SML(*>*)in SML file "examples/collect_duplicates.ML" |
22798 | 604 |
|
605 |
text {* |
|
606 |
\lstsml{Thy/examples/collect_duplicates.ML} |
|
607 |
*} |
|
608 |
||
609 |
text {* |
|
610 |
Obviously, polymorphic equality is implemented the Haskell |
|
26513 | 611 |
way using a type class. How is this achieved? HOL introduces |
612 |
an explicit class @{text eq} with a corresponding operation |
|
613 |
@{const eq} such that @{thm eq [no_vars]}. |
|
614 |
The preprocessing framework does the rest. |
|
22798 | 615 |
For datatypes, instances of @{text eq} are implicitly derived |
26513 | 616 |
when possible. For other types, you may instantiate @{text eq} |
617 |
manually like any other type class. |
|
22798 | 618 |
|
619 |
Though this @{text eq} class is designed to get rarely in |
|
620 |
the way, a subtlety |
|
621 |
enters the stage when definitions of overloaded constants |
|
622 |
are dependent on operational equality. For example, let |
|
623 |
us define a lexicographic ordering on tuples: |
|
624 |
*} |
|
26513 | 625 |
|
25870 | 626 |
instantiation * :: (ord, ord) ord |
627 |
begin |
|
628 |
||
629 |
definition |
|
630 |
[code func del]: "p1 < p2 \<longleftrightarrow> (let (x1, y1) = p1; (x2, y2) = p2 in |
|
631 |
x1 < x2 \<or> (x1 = x2 \<and> y1 < y2))" |
|
22798 | 632 |
|
25870 | 633 |
definition |
634 |
[code func del]: "p1 \<le> p2 \<longleftrightarrow> (let (x1, y1) = p1; (x2, y2) = p2 in |
|
635 |
x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2))" |
|
636 |
||
637 |
instance .. |
|
638 |
||
639 |
end |
|
22798 | 640 |
|
22845 | 641 |
lemma ord_prod [code func(*<*), code func del(*>*)]: |
22798 | 642 |
"(x1 \<Colon> 'a\<Colon>ord, y1 \<Colon> 'b\<Colon>ord) < (x2, y2) \<longleftrightarrow> x1 < x2 \<or> (x1 = x2 \<and> y1 < y2)" |
643 |
"(x1 \<Colon> 'a\<Colon>ord, y1 \<Colon> 'b\<Colon>ord) \<le> (x2, y2) \<longleftrightarrow> x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2)" |
|
644 |
unfolding less_prod_def less_eq_prod_def by simp_all |
|
645 |
||
646 |
text {* |
|
647 |
Then code generation will fail. Why? The definition |
|
25369
5200374fda5d
replaced @{const} (allows name only) by proper @{term};
wenzelm
parents:
24994
diff
changeset
|
648 |
of @{term "op \<le>"} depends on equality on both arguments, |
22798 | 649 |
which are polymorphic and impose an additional @{text eq} |
650 |
class constraint, thus violating the type discipline |
|
651 |
for class operations. |
|
652 |
||
653 |
The solution is to add @{text eq} explicitly to the first sort arguments in the |
|
654 |
code theorems: |
|
655 |
*} |
|
656 |
||
657 |
lemma ord_prod_code [code func]: |
|
658 |
"(x1 \<Colon> 'a\<Colon>{ord, eq}, y1 \<Colon> 'b\<Colon>ord) < (x2, y2) \<longleftrightarrow> |
|
659 |
x1 < x2 \<or> (x1 = x2 \<and> y1 < y2)" |
|
660 |
"(x1 \<Colon> 'a\<Colon>{ord, eq}, y1 \<Colon> 'b\<Colon>ord) \<le> (x2, y2) \<longleftrightarrow> |
|
661 |
x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2)" |
|
662 |
unfolding ord_prod by rule+ |
|
663 |
||
664 |
text {* |
|
665 |
\noindent Then code generation succeeds: |
|
666 |
*} |
|
667 |
||
24348 | 668 |
export_code "op \<le> \<Colon> 'a\<Colon>{eq, ord} \<times> 'b\<Colon>ord \<Rightarrow> 'a \<times> 'b \<Rightarrow> bool" |
22845 | 669 |
(*<*)in SML(*>*)in SML file "examples/lexicographic.ML" |
22798 | 670 |
|
671 |
text {* |
|
672 |
\lstsml{Thy/examples/lexicographic.ML} |
|
673 |
*} |
|
674 |
||
675 |
text {* |
|
676 |
In general, code theorems for overloaded constants may have more |
|
677 |
restrictive sort constraints than the underlying instance relation |
|
678 |
between class and type constructor as long as the whole system of |
|
679 |
constraints is coregular; code theorems violating coregularity |
|
680 |
are rejected immediately. Consequently, it might be necessary |
|
681 |
to delete disturbing theorems in the code theorem table, |
|
682 |
as we have done here with the original definitions @{text less_prod_def} |
|
683 |
and @{text less_eq_prod_def}. |
|
22885 | 684 |
|
685 |
In some cases, the automatically derived defining equations |
|
686 |
for equality on a particular type may not be appropriate. |
|
687 |
As example, watch the following datatype representing |
|
23130 | 688 |
monomorphic parametric types (where type constructors |
689 |
are referred to by natural numbers): |
|
22798 | 690 |
*} |
691 |
||
23130 | 692 |
datatype monotype = Mono nat "monotype list" |
22885 | 693 |
(*<*) |
694 |
lemma monotype_eq: |
|
695 |
"Mono tyco1 typargs1 = Mono tyco2 typargs2 \<equiv> |
|
696 |
tyco1 = tyco2 \<and> typargs1 = typargs2" by simp |
|
697 |
(*>*) |
|
698 |
||
699 |
text {* |
|
700 |
Then code generation for SML would fail with a message |
|
701 |
that the generated code conains illegal mutual dependencies: |
|
23130 | 702 |
the theorem @{thm monotype_eq [no_vars]} already requires the |
22885 | 703 |
instance @{text "monotype \<Colon> eq"}, which itself requires |
23130 | 704 |
@{thm monotype_eq [no_vars]}; Haskell has no problem with mutually |
22885 | 705 |
recursive @{text instance} and @{text function} definitions, |
706 |
but the SML serializer does not support this. |
|
707 |
||
708 |
In such cases, you have to provide you own equality equations |
|
709 |
involving auxiliary constants. In our case, |
|
710 |
@{const [show_types] list_all2} can do the job: |
|
711 |
*} |
|
712 |
||
713 |
lemma monotype_eq_list_all2 [code func]: |
|
714 |
"Mono tyco1 typargs1 = Mono tyco2 typargs2 \<longleftrightarrow> |
|
715 |
tyco1 = tyco2 \<and> list_all2 (op =) typargs1 typargs2" |
|
716 |
by (simp add: list_all2_eq [symmetric]) |
|
717 |
||
718 |
text {* |
|
719 |
does not depend on instance @{text "monotype \<Colon> eq"}: |
|
720 |
*} |
|
721 |
||
24348 | 722 |
export_code "op = :: monotype \<Rightarrow> monotype \<Rightarrow> bool" |
22885 | 723 |
(*<*)in SML(*>*)in SML file "examples/monotype.ML" |
724 |
||
725 |
text {* |
|
726 |
\lstsml{Thy/examples/monotype.ML} |
|
727 |
*} |
|
22798 | 728 |
|
729 |
subsection {* Programs as sets of theorems *} |
|
730 |
||
731 |
text {* |
|
732 |
As told in \secref{sec:concept}, code generation is based |
|
733 |
on a structured collection of code theorems. |
|
734 |
For explorative purpose, this collection |
|
735 |
may be inspected using the @{text "\<CODETHMS>"} command: |
|
736 |
*} |
|
737 |
||
738 |
code_thms "op mod :: nat \<Rightarrow> nat \<Rightarrow> nat" |
|
739 |
||
740 |
text {* |
|
741 |
\noindent prints a table with \emph{all} defining equations |
|
25369
5200374fda5d
replaced @{const} (allows name only) by proper @{term};
wenzelm
parents:
24994
diff
changeset
|
742 |
for @{term "op mod :: nat \<Rightarrow> nat \<Rightarrow> nat"}, including |
22798 | 743 |
\emph{all} defining equations those equations depend |
744 |
on recursivly. @{text "\<CODETHMS>"} provides a convenient |
|
745 |
mechanism to inspect the impact of a preprocessor setup |
|
746 |
on defining equations. |
|
747 |
||
748 |
Similarly, the @{text "\<CODEDEPS>"} command shows a graph |
|
749 |
visualizing dependencies between defining equations. |
|
750 |
*} |
|
751 |
||
752 |
||
25411 | 753 |
subsection {* Constructor sets for datatypes *} |
754 |
||
755 |
text {* |
|
756 |
Conceptually, any datatype is spanned by a set of |
|
757 |
\emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} |
|
758 |
where @{text "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is excactly the set of \emph{all} |
|
759 |
type variables in @{text "\<tau>"}. The HOL datatype package |
|
760 |
by default registers any new datatype in the table |
|
761 |
of datatypes, which may be inspected using |
|
762 |
the @{text "\<PRINTCODESETUP>"} command. |
|
763 |
||
764 |
In some cases, it may be convenient to alter or |
|
765 |
extend this table; as an example, we show |
|
766 |
how to implement finite sets by lists |
|
767 |
using the conversion @{term [source] "set \<Colon> 'a list \<Rightarrow> 'a set"} |
|
768 |
as constructor: |
|
769 |
*} (*<*)ML {* nonfix union *} lemmas [code func del] = in_code union_empty1 union_empty2 union_insert Union_code(*>*) |
|
770 |
||
771 |
code_datatype set |
|
772 |
||
773 |
lemma [code func]: "{} = set []" by simp |
|
774 |
||
775 |
lemma [code func]: "insert x (set xs) = set (x#xs)" by simp |
|
776 |
||
777 |
lemma [code func]: "x \<in> set xs \<longleftrightarrow> x mem xs" by (induct xs) simp_all |
|
778 |
||
779 |
lemma [code func]: "xs \<union> set ys = foldr insert ys xs" by (induct ys) simp_all |
|
780 |
||
781 |
lemma [code func]: "\<Union>set xs = foldr (op \<union>) xs {}" by (induct xs) simp_all |
|
782 |
||
783 |
export_code "{}" insert "op \<in>" "op \<union>" "Union" (*<*)in SML(*>*) in SML file "examples/set_list.ML" |
|
784 |
||
785 |
text {* |
|
786 |
\lstsml{Thy/examples/set_list.ML} |
|
787 |
||
788 |
\medskip |
|
789 |
||
790 |
Changing the representation of existing datatypes requires |
|
791 |
some care with respect to pattern matching and such. |
|
792 |
*} |
|
793 |
||
794 |
||
21058 | 795 |
subsection {* Customizing serialization *} |
20948 | 796 |
|
22798 | 797 |
subsubsection {* Basics *} |
798 |
||
21147 | 799 |
text {* |
800 |
Consider the following function and its corresponding |
|
801 |
SML code: |
|
802 |
*} |
|
803 |
||
25870 | 804 |
primrec |
21147 | 805 |
in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where |
806 |
"in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l" |
|
807 |
(*<*) |
|
21323 | 808 |
code_type %tt bool |
21147 | 809 |
(SML) |
21323 | 810 |
code_const %tt True and False and "op \<and>" and Not |
21147 | 811 |
(SML and and and) |
812 |
(*>*) |
|
24348 | 813 |
export_code in_interval (*<*)in SML(*>*)in SML file "examples/bool_literal.ML" |
21147 | 814 |
|
815 |
text {* |
|
21323 | 816 |
\lstsml{Thy/examples/bool_literal.ML} |
21147 | 817 |
|
22798 | 818 |
\noindent Though this is correct code, it is a little bit unsatisfactory: |
21147 | 819 |
boolean values and operators are materialized as distinguished |
820 |
entities with have nothing to do with the SML-builtin notion |
|
821 |
of \qt{bool}. This results in less readable code; |
|
822 |
additionally, eager evaluation may cause programs to |
|
823 |
loop or break which would perfectly terminate when |
|
824 |
the existing SML \qt{bool} would be used. To map |
|
825 |
the HOL \qt{bool} on SML \qt{bool}, we may use |
|
826 |
\qn{custom serializations}: |
|
827 |
*} |
|
828 |
||
21323 | 829 |
code_type %tt bool |
21147 | 830 |
(SML "bool") |
21323 | 831 |
code_const %tt True and False and "op \<and>" |
21147 | 832 |
(SML "true" and "false" and "_ andalso _") |
833 |
||
834 |
text {* |
|
21323 | 835 |
The @{text "\<CODETYPE>"} commad takes a type constructor |
21147 | 836 |
as arguments together with a list of custom serializations. |
837 |
Each custom serialization starts with a target language |
|
838 |
identifier followed by an expression, which during |
|
839 |
code serialization is inserted whenever the type constructor |
|
21323 | 840 |
would occur. For constants, @{text "\<CODECONST>"} implements |
841 |
the corresponding mechanism. Each ``@{verbatim "_"}'' in |
|
21147 | 842 |
a serialization expression is treated as a placeholder |
843 |
for the type constructor's (the constant's) arguments. |
|
844 |
*} |
|
845 |
||
24348 | 846 |
export_code in_interval (*<*)in SML(*>*) in SML file "examples/bool_mlbool.ML" |
21147 | 847 |
|
848 |
text {* |
|
21323 | 849 |
\lstsml{Thy/examples/bool_mlbool.ML} |
21147 | 850 |
|
22798 | 851 |
\noindent This still is not perfect: the parentheses |
21323 | 852 |
around the \qt{andalso} expression are superfluous. |
853 |
Though the serializer |
|
21147 | 854 |
by no means attempts to imitate the rich Isabelle syntax |
855 |
framework, it provides some common idioms, notably |
|
856 |
associative infixes with precedences which may be used here: |
|
857 |
*} |
|
858 |
||
21323 | 859 |
code_const %tt "op \<and>" |
21147 | 860 |
(SML infixl 1 "andalso") |
861 |
||
24348 | 862 |
export_code in_interval (*<*)in SML(*>*) in SML file "examples/bool_infix.ML" |
21147 | 863 |
|
864 |
text {* |
|
21323 | 865 |
\lstsml{Thy/examples/bool_infix.ML} |
21147 | 866 |
|
22798 | 867 |
\medskip |
868 |
||
21147 | 869 |
Next, we try to map HOL pairs to SML pairs, using the |
21323 | 870 |
infix ``@{verbatim "*"}'' type constructor and parentheses: |
21147 | 871 |
*} |
872 |
(*<*) |
|
873 |
code_type * |
|
874 |
(SML) |
|
875 |
code_const Pair |
|
876 |
(SML) |
|
877 |
(*>*) |
|
21323 | 878 |
code_type %tt * |
21147 | 879 |
(SML infix 2 "*") |
21323 | 880 |
code_const %tt Pair |
21147 | 881 |
(SML "!((_),/ (_))") |
882 |
||
883 |
text {* |
|
21323 | 884 |
The initial bang ``@{verbatim "!"}'' tells the serializer to never put |
21147 | 885 |
parentheses around the whole expression (they are already present), |
886 |
while the parentheses around argument place holders |
|
887 |
tell not to put parentheses around the arguments. |
|
21323 | 888 |
The slash ``@{verbatim "/"}'' (followed by arbitrary white space) |
21147 | 889 |
inserts a space which may be used as a break if necessary |
890 |
during pretty printing. |
|
891 |
||
22798 | 892 |
These examples give a glimpse what mechanisms |
21178 | 893 |
custom serializations provide; however their usage |
894 |
requires careful thinking in order not to introduce |
|
895 |
inconsistencies -- or, in other words: |
|
896 |
custom serializations are completely axiomatic. |
|
21147 | 897 |
|
21178 | 898 |
A further noteworthy details is that any special |
899 |
character in a custom serialization may be quoted |
|
21323 | 900 |
using ``@{verbatim "'"}''; thus, in |
901 |
``@{verbatim "fn '_ => _"}'' the first |
|
902 |
``@{verbatim "_"}'' is a proper underscore while the |
|
903 |
second ``@{verbatim "_"}'' is a placeholder. |
|
21147 | 904 |
|
21178 | 905 |
The HOL theories provide further |
25411 | 906 |
examples for custom serializations. |
21178 | 907 |
*} |
21147 | 908 |
|
22188
a63889770d57
adjusted manual to improved treatment of overloaded constants
haftmann
parents:
22175
diff
changeset
|
909 |
|
21178 | 910 |
subsubsection {* Haskell serialization *} |
911 |
||
912 |
text {* |
|
913 |
For convenience, the default |
|
914 |
HOL setup for Haskell maps the @{text eq} class to |
|
915 |
its counterpart in Haskell, giving custom serializations |
|
21323 | 916 |
for the class (@{text "\<CODECLASS>"}) and its operation: |
21178 | 917 |
*} |
918 |
||
21323 | 919 |
code_class %tt eq |
22798 | 920 |
(Haskell "Eq" where "op =" \<equiv> "(==)") |
21178 | 921 |
|
22798 | 922 |
code_const %tt "op =" |
21178 | 923 |
(Haskell infixl 4 "==") |
924 |
||
925 |
text {* |
|
926 |
A problem now occurs whenever a type which |
|
927 |
is an instance of @{text eq} in HOL is mapped |
|
928 |
on a Haskell-builtin type which is also an instance |
|
929 |
of Haskell @{text Eq}: |
|
21147 | 930 |
*} |
931 |
||
21178 | 932 |
typedecl bar |
933 |
||
26513 | 934 |
instantiation bar :: eq |
935 |
begin |
|
936 |
||
937 |
definition "eq (x\<Colon>bar) y \<longleftrightarrow> x = y" |
|
938 |
||
939 |
instance by default (simp add: eq_bar_def) |
|
940 |
||
941 |
end |
|
21178 | 942 |
|
21323 | 943 |
code_type %tt bar |
21178 | 944 |
(Haskell "Integer") |
945 |
||
946 |
text {* |
|
22188
a63889770d57
adjusted manual to improved treatment of overloaded constants
haftmann
parents:
22175
diff
changeset
|
947 |
The code generator would produce |
a63889770d57
adjusted manual to improved treatment of overloaded constants
haftmann
parents:
22175
diff
changeset
|
948 |
an additional instance, which of course is rejected. |
a63889770d57
adjusted manual to improved treatment of overloaded constants
haftmann
parents:
22175
diff
changeset
|
949 |
To suppress this additional instance, use |
a63889770d57
adjusted manual to improved treatment of overloaded constants
haftmann
parents:
22175
diff
changeset
|
950 |
@{text "\<CODEINSTANCE>"}: |
21147 | 951 |
*} |
952 |
||
21323 | 953 |
code_instance %tt bar :: eq |
21178 | 954 |
(Haskell -) |
955 |
||
956 |
||
22798 | 957 |
subsubsection {* Pretty printing *} |
21189 | 958 |
|
959 |
text {* |
|
22798 | 960 |
The serializer provides ML interfaces to set up |
961 |
pretty serializations for expressions like lists, numerals |
|
962 |
and characters; these are |
|
963 |
monolithic stubs and should only be used with the |
|
964 |
theories introduces in \secref{sec:library}. |
|
21189 | 965 |
*} |
966 |
||
22550 | 967 |
|
21189 | 968 |
subsection {* Cyclic module dependencies *} |
21178 | 969 |
|
21189 | 970 |
text {* |
971 |
Sometimes the awkward situation occurs that dependencies |
|
972 |
between definitions introduce cyclic dependencies |
|
973 |
between modules, which in the Haskell world leaves |
|
974 |
you to the mercy of the Haskell implementation you are using, |
|
975 |
while for SML code generation is not possible. |
|
21178 | 976 |
|
21189 | 977 |
A solution is to declare module names explicitly. |
978 |
Let use assume the three cyclically dependent |
|
979 |
modules are named \emph{A}, \emph{B} and \emph{C}. |
|
980 |
Then, by stating |
|
981 |
*} |
|
982 |
||
983 |
code_modulename SML |
|
984 |
A ABC |
|
985 |
B ABC |
|
986 |
C ABC |
|
987 |
||
988 |
text {* |
|
989 |
we explicitly map all those modules on \emph{ABC}, |
|
990 |
resulting in an ad-hoc merge of this three modules |
|
991 |
at serialization time. |
|
992 |
*} |
|
21147 | 993 |
|
22798 | 994 |
subsection {* Incremental code generation *} |
995 |
||
996 |
text {* |
|
997 |
Code generation is \emph{incremental}: theorems |
|
998 |
and abstract intermediate code are cached and extended on demand. |
|
999 |
The cache may be partially or fully dropped if the underlying |
|
1000 |
executable content of the theory changes. |
|
1001 |
Implementation of caching is supposed to transparently |
|
1002 |
hid away the details from the user. Anyway, caching |
|
1003 |
reaches the surface by using a slightly more general form |
|
1004 |
of the @{text "\<CODETHMS>"}, @{text "\<CODEDEPS>"} |
|
24348 | 1005 |
and @{text "\<EXPORTCODE>"} commands: the list of constants |
22798 | 1006 |
may be omitted. Then, all constants with code theorems |
1007 |
in the current cache are referred to. |
|
1008 |
*} |
|
1009 |
||
25411 | 1010 |
(*subsection {* Code generation and proof extraction *} |
21189 | 1011 |
|
1012 |
text {* |
|
24217 | 1013 |
\fixme |
25411 | 1014 |
*}*) |
21189 | 1015 |
|
20948 | 1016 |
|
21058 | 1017 |
section {* ML interfaces \label{sec:ml} *} |
20948 | 1018 |
|
21189 | 1019 |
text {* |
1020 |
Since the code generator framework not only aims to provide |
|
1021 |
a nice Isar interface but also to form a base for |
|
1022 |
code-generation-based applications, here a short |
|
1023 |
description of the most important ML interfaces. |
|
1024 |
*} |
|
1025 |
||
24217 | 1026 |
subsection {* Executable theory content: @{text Code} *} |
21147 | 1027 |
|
1028 |
text {* |
|
1029 |
This Pure module implements the core notions of |
|
1030 |
executable content of a theory. |
|
1031 |
*} |
|
1032 |
||
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
1033 |
subsubsection {* Managing executable content *} |
20948 | 1034 |
|
21147 | 1035 |
text %mlref {* |
1036 |
\begin{mldecls} |
|
25369
5200374fda5d
replaced @{const} (allows name only) by proper @{term};
wenzelm
parents:
24994
diff
changeset
|
1037 |
@{index_ML Code.add_func: "thm -> theory -> theory"} \\ |
24217 | 1038 |
@{index_ML Code.del_func: "thm -> theory -> theory"} \\ |
24421 | 1039 |
@{index_ML Code.add_funcl: "string * thm list Susp.T -> theory -> theory"} \\ |
24217 | 1040 |
@{index_ML Code.add_inline: "thm -> theory -> theory"} \\ |
1041 |
@{index_ML Code.del_inline: "thm -> theory -> theory"} \\ |
|
1042 |
@{index_ML Code.add_inline_proc: "string * (theory -> cterm list -> thm list) |
|
21189 | 1043 |
-> theory -> theory"} \\ |
24217 | 1044 |
@{index_ML Code.del_inline_proc: "string -> theory -> theory"} \\ |
1045 |
@{index_ML Code.add_preproc: "string * (theory -> thm list -> thm list) |
|
21189 | 1046 |
-> theory -> theory"} \\ |
24217 | 1047 |
@{index_ML Code.del_preproc: "string -> theory -> theory"} \\ |
24421 | 1048 |
@{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\ |
24217 | 1049 |
@{index_ML Code.get_datatype: "theory -> string |
22479 | 1050 |
-> (string * sort) list * (string * typ list) list"} \\ |
24421 | 1051 |
@{index_ML Code.get_datatype_of_constr: "theory -> string -> string option"} |
21147 | 1052 |
\end{mldecls} |
1053 |
||
1054 |
\begin{description} |
|
1055 |
||
24217 | 1056 |
\item @{ML Code.add_func}~@{text "thm"}~@{text "thy"} adds function |
21189 | 1057 |
theorem @{text "thm"} to executable content. |
1058 |
||
24217 | 1059 |
\item @{ML Code.del_func}~@{text "thm"}~@{text "thy"} removes function |
21189 | 1060 |
theorem @{text "thm"} from executable content, if present. |
1061 |
||
24217 | 1062 |
\item @{ML Code.add_funcl}~@{text "(const, lthms)"}~@{text "thy"} adds |
22060 | 1063 |
suspended defining equations @{text lthms} for constant |
21189 | 1064 |
@{text const} to executable content. |
1065 |
||
24217 | 1066 |
\item @{ML Code.add_inline}~@{text "thm"}~@{text "thy"} adds |
21223 | 1067 |
inlining theorem @{text thm} to executable content. |
21189 | 1068 |
|
24217 | 1069 |
\item @{ML Code.del_inline}~@{text "thm"}~@{text "thy"} remove |
21189 | 1070 |
inlining theorem @{text thm} from executable content, if present. |
1071 |
||
24217 | 1072 |
\item @{ML Code.add_inline_proc}~@{text "(name, f)"}~@{text "thy"} adds |
22046 | 1073 |
inline procedure @{text f} (named @{text name}) to executable content; |
21189 | 1074 |
@{text f} is a computation of rewrite rules dependent on |
1075 |
the current theory context and the list of all arguments |
|
22060 | 1076 |
and right hand sides of the defining equations belonging |
21189 | 1077 |
to a certain function definition. |
1078 |
||
24217 | 1079 |
\item @{ML Code.del_inline_proc}~@{text "name"}~@{text "thy"} removes |
22046 | 1080 |
inline procedure named @{text name} from executable content. |
1081 |
||
24217 | 1082 |
\item @{ML Code.add_preproc}~@{text "(name, f)"}~@{text "thy"} adds |
22046 | 1083 |
generic preprocessor @{text f} (named @{text name}) to executable content; |
22060 | 1084 |
@{text f} is a transformation of the defining equations belonging |
21189 | 1085 |
to a certain function definition, depending on the |
1086 |
current theory context. |
|
1087 |
||
24217 | 1088 |
\item @{ML Code.del_preproc}~@{text "name"}~@{text "thy"} removes |
22046 | 1089 |
generic preprcoessor named @{text name} from executable content. |
1090 |
||
24421 | 1091 |
\item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds |
1092 |
a datatype to executable content, with generation |
|
1093 |
set @{text cs}. |
|
21189 | 1094 |
|
24217 | 1095 |
\item @{ML Code.get_datatype_of_constr}~@{text "thy"}~@{text "const"} |
21189 | 1096 |
returns type constructor corresponding to |
1097 |
constructor @{text const}; returns @{text NONE} |
|
1098 |
if @{text const} is no constructor. |
|
21147 | 1099 |
|
1100 |
\end{description} |
|
1101 |
*} |
|
1102 |
||
22292
3b118010ec08
adjusted to new code generator Isar commands and changes in implementation
haftmann
parents:
22188
diff
changeset
|
1103 |
subsection {* Auxiliary *} |
21147 | 1104 |
|
1105 |
text %mlref {* |
|
1106 |
\begin{mldecls} |
|
24421 | 1107 |
@{index_ML CodeUnit.read_const: "theory -> string -> string"} \\ |
1108 |
@{index_ML CodeUnit.head_func: "thm -> string * typ"} \\ |
|
24217 | 1109 |
@{index_ML CodeUnit.rewrite_func: "thm list -> thm -> thm"} \\ |
21147 | 1110 |
\end{mldecls} |
21217 | 1111 |
|
1112 |
\begin{description} |
|
1113 |
||
24217 | 1114 |
\item @{ML CodeUnit.read_const}~@{text thy}~@{text s} |
21217 | 1115 |
reads a constant as a concrete term expression @{text s}. |
1116 |
||
24217 | 1117 |
\item @{ML CodeUnit.head_func}~@{text thm} |
22751 | 1118 |
extracts the constant and its type from a defining equation @{text thm}. |
21217 | 1119 |
|
24217 | 1120 |
\item @{ML CodeUnit.rewrite_func}~@{text rews}~@{text thm} |
22060 | 1121 |
rewrites a defining equation @{text thm} with a set of rewrite |
21217 | 1122 |
rules @{text rews}; only arguments and right hand side are rewritten, |
22060 | 1123 |
not the head of the defining equation. |
21217 | 1124 |
|
1125 |
\end{description} |
|
1126 |
||
21147 | 1127 |
*} |
20948 | 1128 |
|
1129 |
subsection {* Implementing code generator applications *} |
|
1130 |
||
21147 | 1131 |
text {* |
21217 | 1132 |
Implementing code generator applications on top |
1133 |
of the framework set out so far usually not only |
|
1134 |
involves using those primitive interfaces |
|
1135 |
but also storing code-dependent data and various |
|
1136 |
other things. |
|
1137 |
||
21147 | 1138 |
\begin{warn} |
1139 |
Some interfaces discussed here have not reached |
|
1140 |
a final state yet. |
|
1141 |
Changes likely to occur in future. |
|
1142 |
\end{warn} |
|
1143 |
*} |
|
1144 |
||
1145 |
subsubsection {* Data depending on the theory's executable content *} |
|
1146 |
||
21217 | 1147 |
text {* |
21452 | 1148 |
Due to incrementality of code generation, changes in the |
1149 |
theory's executable content have to be propagated in a |
|
1150 |
certain fashion. Additionally, such changes may occur |
|
1151 |
not only during theory extension but also during theory |
|
1152 |
merge, which is a little bit nasty from an implementation |
|
1153 |
point of view. The framework provides a solution |
|
1154 |
to this technical challenge by providing a functorial |
|
1155 |
data slot @{ML_functor CodeDataFun}; on instantiation |
|
1156 |
of this functor, the following types and operations |
|
1157 |
are required: |
|
1158 |
||
21217 | 1159 |
\medskip |
1160 |
\begin{tabular}{l} |
|
1161 |
@{text "type T"} \\ |
|
1162 |
@{text "val empty: T"} \\ |
|
1163 |
@{text "val merge: Pretty.pp \<rightarrow> T * T \<rightarrow> T"} \\ |
|
24217 | 1164 |
@{text "val purge: theory option \<rightarrow> CodeUnit.const list option \<rightarrow> T \<rightarrow> T"} |
21217 | 1165 |
\end{tabular} |
1166 |
||
21452 | 1167 |
\begin{description} |
1168 |
||
1169 |
\item @{text T} the type of data to store. |
|
1170 |
||
1171 |
\item @{text empty} initial (empty) data. |
|
1172 |
||
1173 |
\item @{text merge} merging two data slots. |
|
1174 |
||
22798 | 1175 |
\item @{text purge}~@{text thy}~@{text consts} propagates changes in executable content; |
21452 | 1176 |
if possible, the current theory context is handed over |
1177 |
as argument @{text thy} (if there is no current theory context (e.g.~during |
|
22798 | 1178 |
theory merge, @{ML NONE}); @{text consts} indicates the kind |
21452 | 1179 |
of change: @{ML NONE} stands for a fundamental change |
22798 | 1180 |
which invalidates any existing code, @{text "SOME consts"} |
1181 |
hints that executable content for constants @{text consts} |
|
21452 | 1182 |
has changed. |
1183 |
||
1184 |
\end{description} |
|
1185 |
||
1186 |
An instance of @{ML_functor CodeDataFun} provides the following |
|
1187 |
interface: |
|
1188 |
||
21217 | 1189 |
\medskip |
1190 |
\begin{tabular}{l} |
|
1191 |
@{text "get: theory \<rightarrow> T"} \\ |
|
1192 |
@{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\ |
|
1193 |
@{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"} |
|
1194 |
\end{tabular} |
|
1195 |
||
1196 |
\begin{description} |
|
1197 |
||
21452 | 1198 |
\item @{text get} retrieval of the current data. |
1199 |
||
1200 |
\item @{text change} update of current data (cached!) |
|
1201 |
by giving a continuation. |
|
1202 |
||
1203 |
\item @{text change_yield} update with side result. |
|
21217 | 1204 |
|
1205 |
\end{description} |
|
1206 |
*} |
|
1207 |
||
24628 | 1208 |
(*subsubsection {* Datatype hooks *} |
21147 | 1209 |
|
21452 | 1210 |
text {* |
1211 |
Isabelle/HOL's datatype package provides a mechanism to |
|
1212 |
extend theories depending on datatype declarations: |
|
1213 |
\emph{datatype hooks}. For example, when declaring a new |
|
22060 | 1214 |
datatype, a hook proves defining equations for equality on |
21452 | 1215 |
that datatype (if possible). |
1216 |
*} |
|
1217 |
||
21217 | 1218 |
text %mlref {* |
1219 |
\begin{mldecls} |
|
21323 | 1220 |
@{index_ML_type DatatypeHooks.hook: "string list -> theory -> theory"} \\ |
21217 | 1221 |
@{index_ML DatatypeHooks.add: "DatatypeHooks.hook -> theory -> theory"} |
1222 |
\end{mldecls} |
|
21452 | 1223 |
|
1224 |
\begin{description} |
|
1225 |
||
1226 |
\item @{ML_type DatatypeHooks.hook} specifies the interface |
|
1227 |
of \emph{datatype hooks}: a theory update |
|
1228 |
depending on the list of newly introduced |
|
1229 |
datatype names. |
|
1230 |
||
1231 |
\item @{ML DatatypeHooks.add} adds a hook to the |
|
1232 |
chain of all hooks. |
|
1233 |
||
1234 |
\end{description} |
|
1235 |
*} |
|
1236 |
||
1237 |
subsubsection {* Trivial typedefs -- type copies *} |
|
1238 |
||
1239 |
text {* |
|
1240 |
Sometimes packages will introduce new types |
|
1241 |
as \emph{marked type copies} similar to Haskell's |
|
1242 |
@{text newtype} declaration (e.g. the HOL record package) |
|
1243 |
\emph{without} tinkering with the overhead of datatypes. |
|
1244 |
Technically, these type copies are trivial forms of typedefs. |
|
1245 |
Since these type copies in code generation view are nothing |
|
1246 |
else than datatypes, they have been given a own package |
|
1247 |
in order to faciliate code generation: |
|
21147 | 1248 |
*} |
21058 | 1249 |
|
21217 | 1250 |
text %mlref {* |
1251 |
\begin{mldecls} |
|
21452 | 1252 |
@{index_ML_type TypecopyPackage.info} \\ |
21217 | 1253 |
@{index_ML TypecopyPackage.add_typecopy: " |
1254 |
bstring * string list -> typ -> (bstring * bstring) option |
|
1255 |
-> theory -> (string * TypecopyPackage.info) * theory"} \\ |
|
1256 |
@{index_ML TypecopyPackage.get_typecopy_info: "theory |
|
1257 |
-> string -> TypecopyPackage.info option"} \\ |
|
1258 |
@{index_ML TypecopyPackage.get_spec: "theory -> string |
|
21452 | 1259 |
-> (string * sort) list * (string * typ list) list"} \\ |
1260 |
@{index_ML_type TypecopyPackage.hook: "string * TypecopyPackage.info -> theory -> theory"} \\ |
|
1261 |
@{index_ML TypecopyPackage.add_hook: |
|
1262 |
"TypecopyPackage.hook -> theory -> theory"} \\ |
|
21217 | 1263 |
\end{mldecls} |
21452 | 1264 |
|
1265 |
\begin{description} |
|
1266 |
||
1267 |
\item @{ML_type TypecopyPackage.info} a record containing |
|
1268 |
the specification and further data of a type copy. |
|
1269 |
||
1270 |
\item @{ML TypecopyPackage.add_typecopy} defines a new |
|
1271 |
type copy. |
|
1272 |
||
1273 |
\item @{ML TypecopyPackage.get_typecopy_info} retrieves |
|
1274 |
data of an existing type copy. |
|
1275 |
||
1276 |
\item @{ML TypecopyPackage.get_spec} retrieves datatype-like |
|
1277 |
specification of a type copy. |
|
1278 |
||
1279 |
\item @{ML_type TypecopyPackage.hook},~@{ML TypecopyPackage.add_hook} |
|
1280 |
provide a hook mechanism corresponding to the hook mechanism |
|
1281 |
on datatypes. |
|
1282 |
||
1283 |
\end{description} |
|
1284 |
*} |
|
1285 |
||
1286 |
subsubsection {* Unifying type copies and datatypes *} |
|
1287 |
||
1288 |
text {* |
|
1289 |
Since datatypes and type copies are mapped to the same concept (datatypes) |
|
1290 |
by code generation, the view on both is unified \qt{code types}: |
|
21217 | 1291 |
*} |
1292 |
||
1293 |
text %mlref {* |
|
1294 |
\begin{mldecls} |
|
21452 | 1295 |
@{index_ML_type DatatypeCodegen.hook: "(string * (bool * ((string * sort) list |
1296 |
* (string * typ list) list))) list |
|
21323 | 1297 |
-> theory -> theory"} \\ |
21217 | 1298 |
@{index_ML DatatypeCodegen.add_codetypes_hook_bootstrap: " |
1299 |
DatatypeCodegen.hook -> theory -> theory"} |
|
1300 |
\end{mldecls} |
|
1301 |
*} |
|
1302 |
||
21222 | 1303 |
text {* |
21452 | 1304 |
\begin{description} |
1305 |
||
1306 |
\item @{ML_type DatatypeCodegen.hook} specifies the code type hook |
|
1307 |
interface: a theory transformation depending on a list of |
|
1308 |
mutual recursive code types; each entry in the list |
|
1309 |
has the structure @{text "(name, (is_data, (vars, cons)))"} |
|
1310 |
where @{text name} is the name of the code type, @{text is_data} |
|
1311 |
is true iff @{text name} is a datatype rather then a type copy, |
|
1312 |
and @{text "(vars, cons)"} is the specification of the code type. |
|
1313 |
||
1314 |
\item @{ML DatatypeCodegen.add_codetypes_hook_bootstrap} adds a code |
|
1315 |
type hook; the hook is immediately processed for all already |
|
1316 |
existing datatypes, in blocks of mutual recursive datatypes, |
|
1317 |
where all datatypes a block depends on are processed before |
|
1318 |
the block. |
|
1319 |
||
1320 |
\end{description} |
|
24628 | 1321 |
*}*) |
21452 | 1322 |
|
24628 | 1323 |
text {* |
21452 | 1324 |
\emph{Happy proving, happy hacking!} |
21222 | 1325 |
*} |
21217 | 1326 |
|
20948 | 1327 |
end |