1 |
|
2 (* $Id$ *) |
|
3 |
|
4 (*<*) |
|
5 theory Codegen |
|
6 imports Main |
|
7 uses "../../../antiquote_setup.ML" |
|
8 begin |
|
9 |
|
10 ML {* Code_Target.code_width := 74 *} |
|
11 (*>*) |
|
12 |
|
13 chapter {* Code generation from Isabelle theories *} |
|
14 |
|
15 section {* Introduction *} |
|
16 |
|
17 subsection {* Motivation *} |
|
18 |
|
19 text {* |
|
20 Executing formal specifications as programs is a well-established |
|
21 topic in the theorem proving community. With increasing |
|
22 application of theorem proving systems in the area of |
|
23 software development and verification, its relevance manifests |
|
24 for running test cases and rapid prototyping. In logical |
|
25 calculi like constructive type theory, |
|
26 a notion of executability is implicit due to the nature |
|
27 of the calculus. In contrast, specifications in Isabelle |
|
28 can be highly non-executable. In order to bridge |
|
29 the gap between logic and executable specifications, |
|
30 an explicit non-trivial transformation has to be applied: |
|
31 code generation. |
|
32 |
|
33 This tutorial introduces a generic code generator for the |
|
34 Isabelle system \cite{isa-tutorial}. |
|
35 Generic in the sense that the |
|
36 \qn{target language} for which code shall ultimately be |
|
37 generated is not fixed but may be an arbitrary state-of-the-art |
|
38 functional programming language (currently, the implementation |
|
39 supports SML \cite{SML}, OCaml \cite{OCaml} and Haskell |
|
40 \cite{haskell-revised-report}). |
|
41 We aim to provide a |
|
42 versatile environment |
|
43 suitable for software development and verification, |
|
44 structuring the process |
|
45 of code generation into a small set of orthogonal principles |
|
46 while achieving a big coverage of application areas |
|
47 with maximum flexibility. |
|
48 |
|
49 Conceptually the code generator framework is part |
|
50 of Isabelle's @{text Pure} meta logic; the object logic |
|
51 @{text HOL} which is an extension of @{text Pure} |
|
52 already comes with a reasonable framework setup and thus provides |
|
53 a good working horse for raising code-generation-driven |
|
54 applications. So, we assume some familiarity and experience |
|
55 with the ingredients of the @{text HOL} \emph{Main} theory |
|
56 (see also \cite{isa-tutorial}). |
|
57 *} |
|
58 |
|
59 |
|
60 subsection {* Overview *} |
|
61 |
|
62 text {* |
|
63 The code generator aims to be usable with no further ado |
|
64 in most cases while allowing for detailed customization. |
|
65 This manifests in the structure of this tutorial: |
|
66 we start with a generic example \secref{sec:example} |
|
67 and introduce code generation concepts \secref{sec:concept}. |
|
68 Section |
|
69 \secref{sec:basics} explains how to use the framework naively, |
|
70 presuming a reasonable default setup. Then, section |
|
71 \secref{sec:advanced} deals with advanced topics, |
|
72 introducing further aspects of the code generator framework |
|
73 in a motivation-driven manner. Last, section \secref{sec:ml} |
|
74 introduces the framework's internal programming interfaces. |
|
75 |
|
76 \begin{warn} |
|
77 Ultimately, the code generator which this tutorial deals with |
|
78 is supposed to replace the already established code generator |
|
79 by Stefan Berghofer \cite{Berghofer-Nipkow:2002}. |
|
80 So, for the moment, there are two distinct code generators |
|
81 in Isabelle. |
|
82 Also note that while the framework itself is |
|
83 object-logic independent, only @{text HOL} provides a reasonable |
|
84 framework setup. |
|
85 \end{warn} |
|
86 *} |
|
87 |
|
88 |
|
89 section {* An example: a simple theory of search trees \label{sec:example} *} |
|
90 |
|
91 text {* |
|
92 When writing executable specifications using @{text HOL}, |
|
93 it is convenient to use |
|
94 three existing packages: the datatype package for defining |
|
95 datatypes, the function package for (recursive) functions, |
|
96 and the class package for overloaded definitions. |
|
97 |
|
98 We develope a small theory of search trees; trees are represented |
|
99 as a datatype with key type @{typ "'a"} and value type @{typ "'b"}: |
|
100 *} |
|
101 |
|
102 datatype ('a, 'b) searchtree = Leaf "'a\<Colon>linorder" 'b |
|
103 | Branch "('a, 'b) searchtree" "'a" "('a, 'b) searchtree" |
|
104 |
|
105 text {* |
|
106 \noindent Note that we have constrained the type of keys |
|
107 to the class of total orders, @{text linorder}. |
|
108 |
|
109 We define @{text find} and @{text update} functions: |
|
110 *} |
|
111 |
|
112 primrec |
|
113 find :: "('a\<Colon>linorder, 'b) searchtree \<Rightarrow> 'a \<Rightarrow> 'b option" where |
|
114 "find (Leaf key val) it = (if it = key then Some val else None)" |
|
115 | "find (Branch t1 key t2) it = (if it \<le> key then find t1 it else find t2 it)" |
|
116 |
|
117 fun |
|
118 update :: "'a\<Colon>linorder \<times> 'b \<Rightarrow> ('a, 'b) searchtree \<Rightarrow> ('a, 'b) searchtree" where |
|
119 "update (it, entry) (Leaf key val) = ( |
|
120 if it = key then Leaf key entry |
|
121 else if it \<le> key |
|
122 then Branch (Leaf it entry) it (Leaf key val) |
|
123 else Branch (Leaf key val) it (Leaf it entry) |
|
124 )" |
|
125 | "update (it, entry) (Branch t1 key t2) = ( |
|
126 if it \<le> key |
|
127 then (Branch (update (it, entry) t1) key t2) |
|
128 else (Branch t1 key (update (it, entry) t2)) |
|
129 )" |
|
130 |
|
131 text {* |
|
132 \noindent For testing purpose, we define a small example |
|
133 using natural numbers @{typ nat} (which are a @{text linorder}) |
|
134 as keys and list of nats as values: |
|
135 *} |
|
136 |
|
137 definition |
|
138 example :: "(nat, nat list) searchtree" |
|
139 where |
|
140 "example = update (Suc (Suc (Suc (Suc 0))), [Suc (Suc 0), Suc (Suc 0)]) (update (Suc (Suc (Suc 0)), [Suc (Suc (Suc 0))]) |
|
141 (update (Suc (Suc 0), [Suc (Suc 0)]) (Leaf (Suc 0) [])))" |
|
142 |
|
143 text {* |
|
144 \noindent Then we generate code |
|
145 *} |
|
146 |
|
147 export_code example (*<*)in SML (*>*)in SML file "examples/tree.ML" |
|
148 |
|
149 text {* |
|
150 \noindent which looks like: |
|
151 \lstsml{Thy/examples/tree.ML} |
|
152 *} |
|
153 |
|
154 |
|
155 section {* Code generation concepts and process \label{sec:concept} *} |
|
156 |
|
157 text {* |
|
158 \begin{figure}[h] |
|
159 \centering |
|
160 \includegraphics[width=0.7\textwidth]{codegen_process} |
|
161 \caption{code generator -- processing overview} |
|
162 \label{fig:process} |
|
163 \end{figure} |
|
164 |
|
165 The code generator employs a notion of executability |
|
166 for three foundational executable ingredients known |
|
167 from functional programming: |
|
168 \emph{defining equations}, \emph{datatypes}, and |
|
169 \emph{type classes}. A defining equation as a first approximation |
|
170 is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"} |
|
171 (an equation headed by a constant @{text f} with arguments |
|
172 @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}). |
|
173 Code generation aims to turn defining equations |
|
174 into a functional program by running through |
|
175 a process (see figure \ref{fig:process}): |
|
176 |
|
177 \begin{itemize} |
|
178 |
|
179 \item Out of the vast collection of theorems proven in a |
|
180 \qn{theory}, a reasonable subset modeling |
|
181 defining equations is \qn{selected}. |
|
182 |
|
183 \item On those selected theorems, certain |
|
184 transformations are carried out |
|
185 (\qn{preprocessing}). Their purpose is to turn theorems |
|
186 representing non- or badly executable |
|
187 specifications into equivalent but executable counterparts. |
|
188 The result is a structured collection of \qn{code theorems}. |
|
189 |
|
190 \item These \qn{code theorems} then are \qn{translated} |
|
191 into an Haskell-like intermediate |
|
192 language. |
|
193 |
|
194 \item Finally, out of the intermediate language the final |
|
195 code in the desired \qn{target language} is \qn{serialized}. |
|
196 |
|
197 \end{itemize} |
|
198 |
|
199 From these steps, only the two last are carried out |
|
200 outside the logic; by keeping this layer as |
|
201 thin as possible, the amount of code to trust is |
|
202 kept to a minimum. |
|
203 *} |
|
204 |
|
205 |
|
206 |
|
207 section {* Basics \label{sec:basics} *} |
|
208 |
|
209 subsection {* Invoking the code generator *} |
|
210 |
|
211 text {* |
|
212 Thanks to a reasonable setup of the @{text HOL} theories, in |
|
213 most cases code generation proceeds without further ado: |
|
214 *} |
|
215 |
|
216 primrec |
|
217 fac :: "nat \<Rightarrow> nat" where |
|
218 "fac 0 = 1" |
|
219 | "fac (Suc n) = Suc n * fac n" |
|
220 |
|
221 text {* |
|
222 \noindent This executable specification is now turned to SML code: |
|
223 *} |
|
224 |
|
225 export_code fac (*<*)in SML(*>*)in SML file "examples/fac.ML" |
|
226 |
|
227 text {* |
|
228 \noindent The @{text "\<EXPORTCODE>"} command takes a space-separated list of |
|
229 constants together with \qn{serialization directives} |
|
230 These start with a \qn{target language} |
|
231 identifier, followed by a file specification |
|
232 where to write the generated code to. |
|
233 |
|
234 Internally, the defining equations for all selected |
|
235 constants are taken, including any transitively required |
|
236 constants, datatypes and classes, resulting in the following |
|
237 code: |
|
238 |
|
239 \lstsml{Thy/examples/fac.ML} |
|
240 |
|
241 The code generator will complain when a required |
|
242 ingredient does not provide a executable counterpart, |
|
243 e.g.~generating code |
|
244 for constants not yielding |
|
245 a defining equation (e.g.~the Hilbert choice |
|
246 operation @{text "SOME"}): |
|
247 *} |
|
248 (*<*) |
|
249 setup {* Sign.add_path "foo" *} |
|
250 (*>*) |
|
251 definition |
|
252 pick_some :: "'a list \<Rightarrow> 'a" where |
|
253 "pick_some xs = (SOME x. x \<in> set xs)" |
|
254 (*<*) |
|
255 hide const pick_some |
|
256 |
|
257 setup {* Sign.parent_path *} |
|
258 |
|
259 definition |
|
260 pick_some :: "'a list \<Rightarrow> 'a" where |
|
261 "pick_some = hd" |
|
262 (*>*) |
|
263 |
|
264 export_code pick_some in SML file "examples/fail_const.ML" |
|
265 |
|
266 text {* \noindent will fail. *} |
|
267 |
|
268 subsection {* Theorem selection *} |
|
269 |
|
270 text {* |
|
271 The list of all defining equations in a theory may be inspected |
|
272 using the @{text "\<PRINTCODESETUP>"} command: |
|
273 *} |
|
274 |
|
275 print_codesetup |
|
276 |
|
277 text {* |
|
278 \noindent which displays a table of constant with corresponding |
|
279 defining equations (the additional stuff displayed |
|
280 shall not bother us for the moment). |
|
281 |
|
282 The typical @{text HOL} tools are already set up in a way that |
|
283 function definitions introduced by @{text "\<DEFINITION>"}, |
|
284 @{text "\<PRIMREC>"}, @{text "\<FUN>"}, |
|
285 @{text "\<FUNCTION>"}, @{text "\<CONSTDEFS>"}, |
|
286 @{text "\<RECDEF>"} are implicitly propagated |
|
287 to this defining equation table. Specific theorems may be |
|
288 selected using an attribute: \emph{code func}. As example, |
|
289 a weight selector function: |
|
290 *} |
|
291 |
|
292 primrec |
|
293 pick :: "(nat \<times> 'a) list \<Rightarrow> nat \<Rightarrow> 'a" where |
|
294 "pick (x#xs) n = (let (k, v) = x in |
|
295 if n < k then v else pick xs (n - k))" |
|
296 |
|
297 text {* |
|
298 \noindent We want to eliminate the explicit destruction |
|
299 of @{term x} to @{term "(k, v)"}: |
|
300 *} |
|
301 |
|
302 lemma [code func]: |
|
303 "pick ((k, v)#xs) n = (if n < k then v else pick xs (n - k))" |
|
304 by simp |
|
305 |
|
306 export_code pick (*<*)in SML(*>*) in SML file "examples/pick1.ML" |
|
307 |
|
308 text {* |
|
309 \noindent This theorem now is used for generating code: |
|
310 |
|
311 \lstsml{Thy/examples/pick1.ML} |
|
312 |
|
313 \noindent The policy is that \emph{default equations} stemming from |
|
314 @{text "\<DEFINITION>"}, |
|
315 @{text "\<PRIMREC>"}, @{text "\<FUN>"}, |
|
316 @{text "\<FUNCTION>"}, @{text "\<CONSTDEFS>"}, |
|
317 @{text "\<RECDEF>"} statements are discarded as soon as an |
|
318 equation is explicitly selected by means of \emph{code func}. |
|
319 Further applications of \emph{code func} add theorems incrementally, |
|
320 but syntactic redundancies are implicitly dropped. For example, |
|
321 using a modified version of the @{const fac} function |
|
322 as defining equation, the then redundant (since |
|
323 syntactically subsumed) original defining equations |
|
324 are dropped. |
|
325 |
|
326 \begin{warn} |
|
327 The attributes \emph{code} and \emph{code del} |
|
328 associated with the existing code generator also apply to |
|
329 the new one: \emph{code} implies \emph{code func}, |
|
330 and \emph{code del} implies \emph{code func del}. |
|
331 \end{warn} |
|
332 *} |
|
333 |
|
334 subsection {* Type classes *} |
|
335 |
|
336 text {* |
|
337 Type classes enter the game via the Isar class package. |
|
338 For a short introduction how to use it, see \cite{isabelle-classes}; |
|
339 here we just illustrate its impact on code generation. |
|
340 |
|
341 In a target language, type classes may be represented |
|
342 natively (as in the case of Haskell). For languages |
|
343 like SML, they are implemented using \emph{dictionaries}. |
|
344 Our following example specifies a class \qt{null}, |
|
345 assigning to each of its inhabitants a \qt{null} value: |
|
346 *} |
|
347 |
|
348 class null = type + |
|
349 fixes null :: 'a |
|
350 |
|
351 primrec |
|
352 head :: "'a\<Colon>null list \<Rightarrow> 'a" where |
|
353 "head [] = null" |
|
354 | "head (x#xs) = x" |
|
355 |
|
356 text {* |
|
357 \noindent We provide some instances for our @{text null}: |
|
358 *} |
|
359 |
|
360 instantiation option and list :: (type) null |
|
361 begin |
|
362 |
|
363 definition |
|
364 "null = None" |
|
365 |
|
366 definition |
|
367 "null = []" |
|
368 |
|
369 instance .. |
|
370 |
1 |
371 end |
2 end |
372 |
|
373 text {* |
|
374 \noindent Constructing a dummy example: |
|
375 *} |
|
376 |
|
377 definition |
|
378 "dummy = head [Some (Suc 0), None]" |
|
379 |
|
380 text {* |
|
381 Type classes offer a suitable occasion to introduce |
|
382 the Haskell serializer. Its usage is almost the same |
|
383 as SML, but, in accordance with conventions |
|
384 some Haskell systems enforce, each module ends |
|
385 up in a single file. The module hierarchy is reflected in |
|
386 the file system, with root directory given as file specification. |
|
387 *} |
|
388 |
|
389 export_code dummy in Haskell file "examples/" |
|
390 (* NOTE: you may use Haskell only once in this document, otherwise |
|
391 you have to work in distinct subdirectories *) |
|
392 |
|
393 text {* |
|
394 \lsthaskell{Thy/examples/Codegen.hs} |
|
395 \noindent (we have left out all other modules). |
|
396 |
|
397 \medskip |
|
398 |
|
399 The whole code in SML with explicit dictionary passing: |
|
400 *} |
|
401 |
|
402 export_code dummy (*<*)in SML(*>*)in SML file "examples/class.ML" |
|
403 |
|
404 text {* |
|
405 \lstsml{Thy/examples/class.ML} |
|
406 |
|
407 \medskip |
|
408 |
|
409 \noindent or in OCaml: |
|
410 *} |
|
411 |
|
412 export_code dummy in OCaml file "examples/class.ocaml" |
|
413 |
|
414 text {* |
|
415 \lstsml{Thy/examples/class.ocaml} |
|
416 |
|
417 \medskip The explicit association of constants |
|
418 to classes can be inspected using the @{text "\<PRINTCLASSES>"} |
|
419 command. |
|
420 *} |
|
421 |
|
422 |
|
423 section {* Recipes and advanced topics \label{sec:advanced} *} |
|
424 |
|
425 text {* |
|
426 In this tutorial, we do not attempt to give an exhaustive |
|
427 description of the code generator framework; instead, |
|
428 we cast a light on advanced topics by introducing |
|
429 them together with practically motivated examples. Concerning |
|
430 further reading, see |
|
431 |
|
432 \begin{itemize} |
|
433 |
|
434 \item the Isabelle/Isar Reference Manual \cite{isabelle-isar-ref} |
|
435 for exhaustive syntax diagrams. |
|
436 \item or \cite{Haftmann-Nipkow:2007:codegen} which deals with foundational issues |
|
437 of the code generator framework. |
|
438 |
|
439 \end{itemize} |
|
440 *} |
|
441 |
|
442 subsection {* Library theories \label{sec:library} *} |
|
443 |
|
444 text {* |
|
445 The @{text HOL} @{text Main} theory already provides a code |
|
446 generator setup |
|
447 which should be suitable for most applications. Common extensions |
|
448 and modifications are available by certain theories of the @{text HOL} |
|
449 library; beside being useful in applications, they may serve |
|
450 as a tutorial for customizing the code generator setup. |
|
451 |
|
452 \begin{description} |
|
453 |
|
454 \item[@{text "Code_Integer"}] represents @{text HOL} integers by big |
|
455 integer literals in target languages. |
|
456 \item[@{text "Code_Char"}] represents @{text HOL} characters by |
|
457 character literals in target languages. |
|
458 \item[@{text "Code_Char_chr"}] like @{text "Code_Char"}, |
|
459 but also offers treatment of character codes; includes |
|
460 @{text "Code_Integer"}. |
|
461 \item[@{text "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers, |
|
462 which in general will result in higher efficency; pattern |
|
463 matching with @{term "0\<Colon>nat"} / @{const "Suc"} |
|
464 is eliminated; includes @{text "Code_Integer"}. |
|
465 \item[@{text "Code_Index"}] provides an additional datatype |
|
466 @{text index} which is mapped to target-language built-in integers. |
|
467 Useful for code setups which involve e.g. indexing of |
|
468 target-language arrays. |
|
469 \item[@{text "Code_Message"}] provides an additional datatype |
|
470 @{text message_string} which is isomorphic to strings; |
|
471 @{text message_string}s are mapped to target-language strings. |
|
472 Useful for code setups which involve e.g. printing (error) messages. |
|
473 |
|
474 \end{description} |
|
475 |
|
476 \begin{warn} |
|
477 When importing any of these theories, they should form the last |
|
478 items in an import list. Since these theories adapt the |
|
479 code generator setup in a non-conservative fashion, |
|
480 strange effects may occur otherwise. |
|
481 \end{warn} |
|
482 *} |
|
483 |
|
484 subsection {* Preprocessing *} |
|
485 |
|
486 text {* |
|
487 Before selected function theorems are turned into abstract |
|
488 code, a chain of definitional transformation steps is carried |
|
489 out: \emph{preprocessing}. In essence, the preprocessor |
|
490 consists of two components: a \emph{simpset} and \emph{function transformers}. |
|
491 |
|
492 The \emph{simpset} allows to employ the full generality of the Isabelle |
|
493 simplifier. Due to the interpretation of theorems |
|
494 as defining equations, rewrites are applied to the right |
|
495 hand side and the arguments of the left hand side of an |
|
496 equation, but never to the constant heading the left hand side. |
|
497 An important special case are \emph{inline theorems} which may be |
|
498 declared an undeclared using the |
|
499 \emph{code inline} or \emph{code inline del} attribute respectively. |
|
500 Some common applications: |
|
501 *} |
|
502 |
|
503 text_raw {* |
|
504 \begin{itemize} |
|
505 *} |
|
506 |
|
507 text {* |
|
508 \item replacing non-executable constructs by executable ones: |
|
509 *} |
|
510 |
|
511 lemma [code inline]: |
|
512 "x \<in> set xs \<longleftrightarrow> x mem xs" by (induct xs) simp_all |
|
513 |
|
514 text {* |
|
515 \item eliminating superfluous constants: |
|
516 *} |
|
517 |
|
518 lemma [code inline]: |
|
519 "1 = Suc 0" by simp |
|
520 |
|
521 text {* |
|
522 \item replacing executable but inconvenient constructs: |
|
523 *} |
|
524 |
|
525 lemma [code inline]: |
|
526 "xs = [] \<longleftrightarrow> List.null xs" by (induct xs) simp_all |
|
527 |
|
528 text_raw {* |
|
529 \end{itemize} |
|
530 *} |
|
531 |
|
532 text {* |
|
533 |
|
534 \emph{Function transformers} provide a very general interface, |
|
535 transforming a list of function theorems to another |
|
536 list of function theorems, provided that neither the heading |
|
537 constant nor its type change. The @{term "0\<Colon>nat"} / @{const Suc} |
|
538 pattern elimination implemented in |
|
539 theory @{text "Efficient_Nat"} (see \secref{eff_nat}) uses this |
|
540 interface. |
|
541 |
|
542 \noindent The current setup of the preprocessor may be inspected using |
|
543 the @{text "\<PRINTCODESETUP>"} command. |
|
544 |
|
545 \begin{warn} |
|
546 The attribute \emph{code unfold} |
|
547 associated with the existing code generator also applies to |
|
548 the new one: \emph{code unfold} implies \emph{code inline}. |
|
549 \end{warn} |
|
550 *} |
|
551 |
|
552 |
|
553 subsection {* Concerning operational equality *} |
|
554 |
|
555 text {* |
|
556 Surely you have already noticed how equality is treated |
|
557 by the code generator: |
|
558 *} |
|
559 |
|
560 primrec |
|
561 collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where |
|
562 "collect_duplicates xs ys [] = xs" |
|
563 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs |
|
564 then if z \<in> set ys |
|
565 then collect_duplicates xs ys zs |
|
566 else collect_duplicates xs (z#ys) zs |
|
567 else collect_duplicates (z#xs) (z#ys) zs)" |
|
568 |
|
569 text {* |
|
570 The membership test during preprocessing is rewritten, |
|
571 resulting in @{const List.member}, which itself |
|
572 performs an explicit equality check. |
|
573 *} |
|
574 |
|
575 export_code collect_duplicates (*<*)in SML(*>*)in SML file "examples/collect_duplicates.ML" |
|
576 |
|
577 text {* |
|
578 \lstsml{Thy/examples/collect_duplicates.ML} |
|
579 *} |
|
580 |
|
581 text {* |
|
582 Obviously, polymorphic equality is implemented the Haskell |
|
583 way using a type class. How is this achieved? HOL introduces |
|
584 an explicit class @{text eq} with a corresponding operation |
|
585 @{const eq_class.eq} such that @{thm eq [no_vars]}. |
|
586 The preprocessing framework does the rest. |
|
587 For datatypes, instances of @{text eq} are implicitly derived |
|
588 when possible. For other types, you may instantiate @{text eq} |
|
589 manually like any other type class. |
|
590 |
|
591 Though this @{text eq} class is designed to get rarely in |
|
592 the way, a subtlety |
|
593 enters the stage when definitions of overloaded constants |
|
594 are dependent on operational equality. For example, let |
|
595 us define a lexicographic ordering on tuples: |
|
596 *} |
|
597 |
|
598 instantiation * :: (ord, ord) ord |
|
599 begin |
|
600 |
|
601 definition |
|
602 [code func del]: "p1 < p2 \<longleftrightarrow> (let (x1, y1) = p1; (x2, y2) = p2 in |
|
603 x1 < x2 \<or> (x1 = x2 \<and> y1 < y2))" |
|
604 |
|
605 definition |
|
606 [code func del]: "p1 \<le> p2 \<longleftrightarrow> (let (x1, y1) = p1; (x2, y2) = p2 in |
|
607 x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2))" |
|
608 |
|
609 instance .. |
|
610 |
|
611 end |
|
612 |
|
613 lemma ord_prod [code func(*<*), code func del(*>*)]: |
|
614 "(x1 \<Colon> 'a\<Colon>ord, y1 \<Colon> 'b\<Colon>ord) < (x2, y2) \<longleftrightarrow> x1 < x2 \<or> (x1 = x2 \<and> y1 < y2)" |
|
615 "(x1 \<Colon> 'a\<Colon>ord, y1 \<Colon> 'b\<Colon>ord) \<le> (x2, y2) \<longleftrightarrow> x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2)" |
|
616 unfolding less_prod_def less_eq_prod_def by simp_all |
|
617 |
|
618 text {* |
|
619 Then code generation will fail. Why? The definition |
|
620 of @{term "op \<le>"} depends on equality on both arguments, |
|
621 which are polymorphic and impose an additional @{text eq} |
|
622 class constraint, thus violating the type discipline |
|
623 for class operations. |
|
624 |
|
625 The solution is to add @{text eq} explicitly to the first sort arguments in the |
|
626 code theorems: |
|
627 *} |
|
628 |
|
629 lemma ord_prod_code [code func]: |
|
630 "(x1 \<Colon> 'a\<Colon>{ord, eq}, y1 \<Colon> 'b\<Colon>ord) < (x2, y2) \<longleftrightarrow> |
|
631 x1 < x2 \<or> (x1 = x2 \<and> y1 < y2)" |
|
632 "(x1 \<Colon> 'a\<Colon>{ord, eq}, y1 \<Colon> 'b\<Colon>ord) \<le> (x2, y2) \<longleftrightarrow> |
|
633 x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2)" |
|
634 unfolding ord_prod by rule+ |
|
635 |
|
636 text {* |
|
637 \noindent Then code generation succeeds: |
|
638 *} |
|
639 |
|
640 export_code "op \<le> \<Colon> 'a\<Colon>{eq, ord} \<times> 'b\<Colon>ord \<Rightarrow> 'a \<times> 'b \<Rightarrow> bool" |
|
641 (*<*)in SML(*>*)in SML file "examples/lexicographic.ML" |
|
642 |
|
643 text {* |
|
644 \lstsml{Thy/examples/lexicographic.ML} |
|
645 *} |
|
646 |
|
647 text {* |
|
648 In general, code theorems for overloaded constants may have more |
|
649 restrictive sort constraints than the underlying instance relation |
|
650 between class and type constructor as long as the whole system of |
|
651 constraints is coregular; code theorems violating coregularity |
|
652 are rejected immediately. Consequently, it might be necessary |
|
653 to delete disturbing theorems in the code theorem table, |
|
654 as we have done here with the original definitions @{text less_prod_def} |
|
655 and @{text less_eq_prod_def}. |
|
656 |
|
657 In some cases, the automatically derived defining equations |
|
658 for equality on a particular type may not be appropriate. |
|
659 As example, watch the following datatype representing |
|
660 monomorphic parametric types (where type constructors |
|
661 are referred to by natural numbers): |
|
662 *} |
|
663 |
|
664 datatype monotype = Mono nat "monotype list" |
|
665 (*<*) |
|
666 lemma monotype_eq: |
|
667 "Mono tyco1 typargs1 = Mono tyco2 typargs2 \<equiv> |
|
668 tyco1 = tyco2 \<and> typargs1 = typargs2" by simp |
|
669 (*>*) |
|
670 |
|
671 text {* |
|
672 Then code generation for SML would fail with a message |
|
673 that the generated code conains illegal mutual dependencies: |
|
674 the theorem @{thm monotype_eq [no_vars]} already requires the |
|
675 instance @{text "monotype \<Colon> eq"}, which itself requires |
|
676 @{thm monotype_eq [no_vars]}; Haskell has no problem with mutually |
|
677 recursive @{text instance} and @{text function} definitions, |
|
678 but the SML serializer does not support this. |
|
679 |
|
680 In such cases, you have to provide you own equality equations |
|
681 involving auxiliary constants. In our case, |
|
682 @{const [show_types] list_all2} can do the job: |
|
683 *} |
|
684 |
|
685 lemma monotype_eq_list_all2 [code func]: |
|
686 "Mono tyco1 typargs1 = Mono tyco2 typargs2 \<longleftrightarrow> |
|
687 tyco1 = tyco2 \<and> list_all2 (op =) typargs1 typargs2" |
|
688 by (simp add: list_all2_eq [symmetric]) |
|
689 |
|
690 text {* |
|
691 does not depend on instance @{text "monotype \<Colon> eq"}: |
|
692 *} |
|
693 |
|
694 export_code "op = :: monotype \<Rightarrow> monotype \<Rightarrow> bool" |
|
695 (*<*)in SML(*>*)in SML file "examples/monotype.ML" |
|
696 |
|
697 text {* |
|
698 \lstsml{Thy/examples/monotype.ML} |
|
699 *} |
|
700 |
|
701 subsection {* Programs as sets of theorems *} |
|
702 |
|
703 text {* |
|
704 As told in \secref{sec:concept}, code generation is based |
|
705 on a structured collection of code theorems. |
|
706 For explorative purpose, this collection |
|
707 may be inspected using the @{text "\<CODETHMS>"} command: |
|
708 *} |
|
709 |
|
710 code_thms "op mod :: nat \<Rightarrow> nat \<Rightarrow> nat" |
|
711 |
|
712 text {* |
|
713 \noindent prints a table with \emph{all} defining equations |
|
714 for @{term "op mod :: nat \<Rightarrow> nat \<Rightarrow> nat"}, including |
|
715 \emph{all} defining equations those equations depend |
|
716 on recursivly. @{text "\<CODETHMS>"} provides a convenient |
|
717 mechanism to inspect the impact of a preprocessor setup |
|
718 on defining equations. |
|
719 |
|
720 Similarly, the @{text "\<CODEDEPS>"} command shows a graph |
|
721 visualizing dependencies between defining equations. |
|
722 *} |
|
723 |
|
724 |
|
725 subsection {* Constructor sets for datatypes *} |
|
726 |
|
727 text {* |
|
728 Conceptually, any datatype is spanned by a set of |
|
729 \emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} |
|
730 where @{text "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is excactly the set of \emph{all} |
|
731 type variables in @{text "\<tau>"}. The HOL datatype package |
|
732 by default registers any new datatype in the table |
|
733 of datatypes, which may be inspected using |
|
734 the @{text "\<PRINTCODESETUP>"} command. |
|
735 |
|
736 In some cases, it may be convenient to alter or |
|
737 extend this table; as an example, we will develope an alternative |
|
738 representation of natural numbers as binary digits, whose |
|
739 size does increase logarithmically with its value, not linear |
|
740 \footnote{Indeed, the @{text "Efficient_Nat"} theory (see \ref{eff_nat}) |
|
741 does something similar}. First, the digit representation: |
|
742 *} |
|
743 |
|
744 definition Dig0 :: "nat \<Rightarrow> nat" where |
|
745 "Dig0 n = 2 * n" |
|
746 |
|
747 definition Dig1 :: "nat \<Rightarrow> nat" where |
|
748 "Dig1 n = Suc (2 * n)" |
|
749 |
|
750 text {* |
|
751 \noindent We will use these two ">digits"< to represent natural numbers |
|
752 in binary digits, e.g.: |
|
753 *} |
|
754 |
|
755 lemma 42: "42 = Dig0 (Dig1 (Dig0 (Dig1 (Dig0 1))))" |
|
756 by (simp add: Dig0_def Dig1_def) |
|
757 |
|
758 text {* |
|
759 \noindent Of course we also have to provide proper code equations for |
|
760 the operations, e.g. @{term "op + \<Colon> nat \<Rightarrow> nat \<Rightarrow> nat"}: |
|
761 *} |
|
762 |
|
763 lemma plus_Dig [code func]: |
|
764 "0 + n = n" |
|
765 "m + 0 = m" |
|
766 "1 + Dig0 n = Dig1 n" |
|
767 "Dig0 m + 1 = Dig1 m" |
|
768 "1 + Dig1 n = Dig0 (n + 1)" |
|
769 "Dig1 m + 1 = Dig0 (m + 1)" |
|
770 "Dig0 m + Dig0 n = Dig0 (m + n)" |
|
771 "Dig0 m + Dig1 n = Dig1 (m + n)" |
|
772 "Dig1 m + Dig0 n = Dig1 (m + n)" |
|
773 "Dig1 m + Dig1 n = Dig0 (m + n + 1)" |
|
774 by (simp_all add: Dig0_def Dig1_def) |
|
775 |
|
776 text {* |
|
777 \noindent We then instruct the code generator to view @{term "0\<Colon>nat"}, |
|
778 @{term "1\<Colon>nat"}, @{term Dig0} and @{term Dig1} as |
|
779 datatype constructors: |
|
780 *} |
|
781 |
|
782 code_datatype "0\<Colon>nat" "1\<Colon>nat" Dig0 Dig1 |
|
783 |
|
784 text {* |
|
785 \noindent For the former constructor @{term Suc}, we provide a code |
|
786 equation and remove some parts of the default code generator setup |
|
787 which are an obstacle here: |
|
788 *} |
|
789 |
|
790 lemma Suc_Dig [code func]: |
|
791 "Suc n = n + 1" |
|
792 by simp |
|
793 |
|
794 declare One_nat_def [code inline del] |
|
795 declare add_Suc_shift [code func del] |
|
796 |
|
797 text {* |
|
798 \noindent This yields the following code: |
|
799 *} |
|
800 |
|
801 export_code "op + \<Colon> nat \<Rightarrow> nat \<Rightarrow> nat" (*<*)in SML(*>*) in SML file "examples/nat_binary.ML" |
|
802 |
|
803 text {* \lstsml{Thy/examples/nat_binary.ML} *} |
|
804 |
|
805 text {* |
|
806 \medskip |
|
807 |
|
808 From this example, it can be easily glimpsed that using own constructor sets |
|
809 is a little delicate since it changes the set of valid patterns for values |
|
810 of that type. Without going into much detail, here some practical hints: |
|
811 |
|
812 \begin{itemize} |
|
813 \item When changing the constuctor set for datatypes, take care to |
|
814 provide an alternative for the @{text case} combinator (e.g. by replacing |
|
815 it using the preprocessor). |
|
816 \item Values in the target language need not to be normalized -- different |
|
817 values in the target language may represent the same value in the |
|
818 logic (e.g. @{text "Dig1 0 = 1"}). |
|
819 \item Usually, a good methodology to deal with the subleties of pattern |
|
820 matching is to see the type as an abstract type: provide a set |
|
821 of operations which operate on the concrete representation of the type, |
|
822 and derive further operations by combinations of these primitive ones, |
|
823 without relying on a particular representation. |
|
824 \end{itemize} |
|
825 *} |
|
826 |
|
827 code_datatype %invisible "0::nat" Suc |
|
828 declare %invisible plus_Dig [code func del] |
|
829 declare %invisible One_nat_def [code inline] |
|
830 declare %invisible add_Suc_shift [code func] |
|
831 lemma %invisible [code func]: "0 + n = (n \<Colon> nat)" by simp |
|
832 |
|
833 |
|
834 subsection {* Customizing serialization *} |
|
835 |
|
836 subsubsection {* Basics *} |
|
837 |
|
838 text {* |
|
839 Consider the following function and its corresponding |
|
840 SML code: |
|
841 *} |
|
842 |
|
843 primrec |
|
844 in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where |
|
845 "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l" |
|
846 (*<*) |
|
847 code_type %tt bool |
|
848 (SML) |
|
849 code_const %tt True and False and "op \<and>" and Not |
|
850 (SML and and and) |
|
851 (*>*) |
|
852 export_code in_interval (*<*)in SML(*>*)in SML file "examples/bool_literal.ML" |
|
853 |
|
854 text {* |
|
855 \lstsml{Thy/examples/bool_literal.ML} |
|
856 |
|
857 \noindent Though this is correct code, it is a little bit unsatisfactory: |
|
858 boolean values and operators are materialized as distinguished |
|
859 entities with have nothing to do with the SML-builtin notion |
|
860 of \qt{bool}. This results in less readable code; |
|
861 additionally, eager evaluation may cause programs to |
|
862 loop or break which would perfectly terminate when |
|
863 the existing SML \qt{bool} would be used. To map |
|
864 the HOL \qt{bool} on SML \qt{bool}, we may use |
|
865 \qn{custom serializations}: |
|
866 *} |
|
867 |
|
868 code_type %tt bool |
|
869 (SML "bool") |
|
870 code_const %tt True and False and "op \<and>" |
|
871 (SML "true" and "false" and "_ andalso _") |
|
872 |
|
873 text {* |
|
874 The @{text "\<CODETYPE>"} commad takes a type constructor |
|
875 as arguments together with a list of custom serializations. |
|
876 Each custom serialization starts with a target language |
|
877 identifier followed by an expression, which during |
|
878 code serialization is inserted whenever the type constructor |
|
879 would occur. For constants, @{text "\<CODECONST>"} implements |
|
880 the corresponding mechanism. Each ``@{verbatim "_"}'' in |
|
881 a serialization expression is treated as a placeholder |
|
882 for the type constructor's (the constant's) arguments. |
|
883 *} |
|
884 |
|
885 export_code in_interval (*<*)in SML(*>*) in SML file "examples/bool_mlbool.ML" |
|
886 |
|
887 text {* |
|
888 \lstsml{Thy/examples/bool_mlbool.ML} |
|
889 |
|
890 \noindent This still is not perfect: the parentheses |
|
891 around the \qt{andalso} expression are superfluous. |
|
892 Though the serializer |
|
893 by no means attempts to imitate the rich Isabelle syntax |
|
894 framework, it provides some common idioms, notably |
|
895 associative infixes with precedences which may be used here: |
|
896 *} |
|
897 |
|
898 code_const %tt "op \<and>" |
|
899 (SML infixl 1 "andalso") |
|
900 |
|
901 export_code in_interval (*<*)in SML(*>*) in SML file "examples/bool_infix.ML" |
|
902 |
|
903 text {* |
|
904 \lstsml{Thy/examples/bool_infix.ML} |
|
905 |
|
906 \medskip |
|
907 |
|
908 Next, we try to map HOL pairs to SML pairs, using the |
|
909 infix ``@{verbatim "*"}'' type constructor and parentheses: |
|
910 *} |
|
911 (*<*) |
|
912 code_type * |
|
913 (SML) |
|
914 code_const Pair |
|
915 (SML) |
|
916 (*>*) |
|
917 code_type %tt * |
|
918 (SML infix 2 "*") |
|
919 code_const %tt Pair |
|
920 (SML "!((_),/ (_))") |
|
921 |
|
922 text {* |
|
923 The initial bang ``@{verbatim "!"}'' tells the serializer to never put |
|
924 parentheses around the whole expression (they are already present), |
|
925 while the parentheses around argument place holders |
|
926 tell not to put parentheses around the arguments. |
|
927 The slash ``@{verbatim "/"}'' (followed by arbitrary white space) |
|
928 inserts a space which may be used as a break if necessary |
|
929 during pretty printing. |
|
930 |
|
931 These examples give a glimpse what mechanisms |
|
932 custom serializations provide; however their usage |
|
933 requires careful thinking in order not to introduce |
|
934 inconsistencies -- or, in other words: |
|
935 custom serializations are completely axiomatic. |
|
936 |
|
937 A further noteworthy details is that any special |
|
938 character in a custom serialization may be quoted |
|
939 using ``@{verbatim "'"}''; thus, in |
|
940 ``@{verbatim "fn '_ => _"}'' the first |
|
941 ``@{verbatim "_"}'' is a proper underscore while the |
|
942 second ``@{verbatim "_"}'' is a placeholder. |
|
943 |
|
944 The HOL theories provide further |
|
945 examples for custom serializations. |
|
946 *} |
|
947 |
|
948 |
|
949 subsubsection {* Haskell serialization *} |
|
950 |
|
951 text {* |
|
952 For convenience, the default |
|
953 HOL setup for Haskell maps the @{text eq} class to |
|
954 its counterpart in Haskell, giving custom serializations |
|
955 for the class (@{text "\<CODECLASS>"}) and its operation: |
|
956 *} |
|
957 |
|
958 code_class %tt eq |
|
959 (Haskell "Eq" where "op =" \<equiv> "(==)") |
|
960 |
|
961 code_const %tt "op =" |
|
962 (Haskell infixl 4 "==") |
|
963 |
|
964 text {* |
|
965 A problem now occurs whenever a type which |
|
966 is an instance of @{text eq} in HOL is mapped |
|
967 on a Haskell-builtin type which is also an instance |
|
968 of Haskell @{text Eq}: |
|
969 *} |
|
970 |
|
971 typedecl bar |
|
972 |
|
973 instantiation bar :: eq |
|
974 begin |
|
975 |
|
976 definition "eq_class.eq (x\<Colon>bar) y \<longleftrightarrow> x = y" |
|
977 |
|
978 instance by default (simp add: eq_bar_def) |
|
979 |
|
980 end |
|
981 |
|
982 code_type %tt bar |
|
983 (Haskell "Integer") |
|
984 |
|
985 text {* |
|
986 The code generator would produce |
|
987 an additional instance, which of course is rejected. |
|
988 To suppress this additional instance, use |
|
989 @{text "\<CODEINSTANCE>"}: |
|
990 *} |
|
991 |
|
992 code_instance %tt bar :: eq |
|
993 (Haskell -) |
|
994 |
|
995 |
|
996 subsubsection {* Pretty printing *} |
|
997 |
|
998 text {* |
|
999 The serializer provides ML interfaces to set up |
|
1000 pretty serializations for expressions like lists, numerals |
|
1001 and characters; these are |
|
1002 monolithic stubs and should only be used with the |
|
1003 theories introduced in \secref{sec:library}. |
|
1004 *} |
|
1005 |
|
1006 |
|
1007 subsection {* Cyclic module dependencies *} |
|
1008 |
|
1009 text {* |
|
1010 Sometimes the awkward situation occurs that dependencies |
|
1011 between definitions introduce cyclic dependencies |
|
1012 between modules, which in the Haskell world leaves |
|
1013 you to the mercy of the Haskell implementation you are using, |
|
1014 while for SML code generation is not possible. |
|
1015 |
|
1016 A solution is to declare module names explicitly. |
|
1017 Let use assume the three cyclically dependent |
|
1018 modules are named \emph{A}, \emph{B} and \emph{C}. |
|
1019 Then, by stating |
|
1020 *} |
|
1021 |
|
1022 code_modulename SML |
|
1023 A ABC |
|
1024 B ABC |
|
1025 C ABC |
|
1026 |
|
1027 text {* |
|
1028 we explicitly map all those modules on \emph{ABC}, |
|
1029 resulting in an ad-hoc merge of this three modules |
|
1030 at serialization time. |
|
1031 *} |
|
1032 |
|
1033 subsection {* Incremental code generation *} |
|
1034 |
|
1035 text {* |
|
1036 Code generation is \emph{incremental}: theorems |
|
1037 and abstract intermediate code are cached and extended on demand. |
|
1038 The cache may be partially or fully dropped if the underlying |
|
1039 executable content of the theory changes. |
|
1040 Implementation of caching is supposed to transparently |
|
1041 hid away the details from the user. Anyway, caching |
|
1042 reaches the surface by using a slightly more general form |
|
1043 of the @{text "\<CODETHMS>"}, @{text "\<CODEDEPS>"} |
|
1044 and @{text "\<EXPORTCODE>"} commands: the list of constants |
|
1045 may be omitted. Then, all constants with code theorems |
|
1046 in the current cache are referred to. |
|
1047 *} |
|
1048 |
|
1049 (*subsection {* Code generation and proof extraction *} |
|
1050 |
|
1051 text {* |
|
1052 \fixme |
|
1053 *}*) |
|
1054 |
|
1055 |
|
1056 section {* ML interfaces \label{sec:ml} *} |
|
1057 |
|
1058 text {* |
|
1059 Since the code generator framework not only aims to provide |
|
1060 a nice Isar interface but also to form a base for |
|
1061 code-generation-based applications, here a short |
|
1062 description of the most important ML interfaces. |
|
1063 *} |
|
1064 |
|
1065 subsection {* Executable theory content: @{text Code} *} |
|
1066 |
|
1067 text {* |
|
1068 This Pure module implements the core notions of |
|
1069 executable content of a theory. |
|
1070 *} |
|
1071 |
|
1072 subsubsection {* Managing executable content *} |
|
1073 |
|
1074 text %mlref {* |
|
1075 \begin{mldecls} |
|
1076 @{index_ML Code.add_func: "thm -> theory -> theory"} \\ |
|
1077 @{index_ML Code.del_func: "thm -> theory -> theory"} \\ |
|
1078 @{index_ML Code.add_funcl: "string * thm list Susp.T -> theory -> theory"} \\ |
|
1079 @{index_ML Code.map_pre: "(MetaSimplifier.simpset -> MetaSimplifier.simpset) -> theory -> theory"} \\ |
|
1080 @{index_ML Code.map_post: "(MetaSimplifier.simpset -> MetaSimplifier.simpset) -> theory -> theory"} \\ |
|
1081 @{index_ML Code.add_functrans: "string * (theory -> thm list -> thm list option) |
|
1082 -> theory -> theory"} \\ |
|
1083 @{index_ML Code.del_functrans: "string -> theory -> theory"} \\ |
|
1084 @{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\ |
|
1085 @{index_ML Code.get_datatype: "theory -> string |
|
1086 -> (string * sort) list * (string * typ list) list"} \\ |
|
1087 @{index_ML Code.get_datatype_of_constr: "theory -> string -> string option"} |
|
1088 \end{mldecls} |
|
1089 |
|
1090 \begin{description} |
|
1091 |
|
1092 \item @{ML Code.add_func}~@{text "thm"}~@{text "thy"} adds function |
|
1093 theorem @{text "thm"} to executable content. |
|
1094 |
|
1095 \item @{ML Code.del_func}~@{text "thm"}~@{text "thy"} removes function |
|
1096 theorem @{text "thm"} from executable content, if present. |
|
1097 |
|
1098 \item @{ML Code.add_funcl}~@{text "(const, lthms)"}~@{text "thy"} adds |
|
1099 suspended defining equations @{text lthms} for constant |
|
1100 @{text const} to executable content. |
|
1101 |
|
1102 \item @{ML Code.map_pre}~@{text "f"}~@{text "thy"} changes |
|
1103 the preprocessor simpset. |
|
1104 |
|
1105 \item @{ML Code.add_functrans}~@{text "(name, f)"}~@{text "thy"} adds |
|
1106 function transformer @{text f} (named @{text name}) to executable content; |
|
1107 @{text f} is a transformer of the defining equations belonging |
|
1108 to a certain function definition, depending on the |
|
1109 current theory context. Returning @{text NONE} indicates that no |
|
1110 transformation took place; otherwise, the whole process will be iterated |
|
1111 with the new defining equations. |
|
1112 |
|
1113 \item @{ML Code.del_functrans}~@{text "name"}~@{text "thy"} removes |
|
1114 function transformer named @{text name} from executable content. |
|
1115 |
|
1116 \item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds |
|
1117 a datatype to executable content, with generation |
|
1118 set @{text cs}. |
|
1119 |
|
1120 \item @{ML Code.get_datatype_of_constr}~@{text "thy"}~@{text "const"} |
|
1121 returns type constructor corresponding to |
|
1122 constructor @{text const}; returns @{text NONE} |
|
1123 if @{text const} is no constructor. |
|
1124 |
|
1125 \end{description} |
|
1126 *} |
|
1127 |
|
1128 subsection {* Auxiliary *} |
|
1129 |
|
1130 text %mlref {* |
|
1131 \begin{mldecls} |
|
1132 @{index_ML Code_Unit.read_const: "theory -> string -> string"} \\ |
|
1133 @{index_ML Code_Unit.head_func: "thm -> string * ((string * sort) list * typ)"} \\ |
|
1134 @{index_ML Code_Unit.rewrite_func: "MetaSimplifier.simpset -> thm -> thm"} \\ |
|
1135 \end{mldecls} |
|
1136 |
|
1137 \begin{description} |
|
1138 |
|
1139 \item @{ML Code_Unit.read_const}~@{text thy}~@{text s} |
|
1140 reads a constant as a concrete term expression @{text s}. |
|
1141 |
|
1142 \item @{ML Code_Unit.head_func}~@{text thm} |
|
1143 extracts the constant and its type from a defining equation @{text thm}. |
|
1144 |
|
1145 \item @{ML Code_Unit.rewrite_func}~@{text ss}~@{text thm} |
|
1146 rewrites a defining equation @{text thm} with a simpset @{text ss}; |
|
1147 only arguments and right hand side are rewritten, |
|
1148 not the head of the defining equation. |
|
1149 |
|
1150 \end{description} |
|
1151 |
|
1152 *} |
|
1153 |
|
1154 subsection {* Implementing code generator applications *} |
|
1155 |
|
1156 text {* |
|
1157 Implementing code generator applications on top |
|
1158 of the framework set out so far usually not only |
|
1159 involves using those primitive interfaces |
|
1160 but also storing code-dependent data and various |
|
1161 other things. |
|
1162 |
|
1163 \begin{warn} |
|
1164 Some interfaces discussed here have not reached |
|
1165 a final state yet. |
|
1166 Changes likely to occur in future. |
|
1167 \end{warn} |
|
1168 *} |
|
1169 |
|
1170 subsubsection {* Data depending on the theory's executable content *} |
|
1171 |
|
1172 text {* |
|
1173 Due to incrementality of code generation, changes in the |
|
1174 theory's executable content have to be propagated in a |
|
1175 certain fashion. Additionally, such changes may occur |
|
1176 not only during theory extension but also during theory |
|
1177 merge, which is a little bit nasty from an implementation |
|
1178 point of view. The framework provides a solution |
|
1179 to this technical challenge by providing a functorial |
|
1180 data slot @{ML_functor CodeDataFun}; on instantiation |
|
1181 of this functor, the following types and operations |
|
1182 are required: |
|
1183 |
|
1184 \medskip |
|
1185 \begin{tabular}{l} |
|
1186 @{text "type T"} \\ |
|
1187 @{text "val empty: T"} \\ |
|
1188 @{text "val merge: Pretty.pp \<rightarrow> T * T \<rightarrow> T"} \\ |
|
1189 @{text "val purge: theory option \<rightarrow> CodeUnit.const list option \<rightarrow> T \<rightarrow> T"} |
|
1190 \end{tabular} |
|
1191 |
|
1192 \begin{description} |
|
1193 |
|
1194 \item @{text T} the type of data to store. |
|
1195 |
|
1196 \item @{text empty} initial (empty) data. |
|
1197 |
|
1198 \item @{text merge} merging two data slots. |
|
1199 |
|
1200 \item @{text purge}~@{text thy}~@{text consts} propagates changes in executable content; |
|
1201 if possible, the current theory context is handed over |
|
1202 as argument @{text thy} (if there is no current theory context (e.g.~during |
|
1203 theory merge, @{ML NONE}); @{text consts} indicates the kind |
|
1204 of change: @{ML NONE} stands for a fundamental change |
|
1205 which invalidates any existing code, @{text "SOME consts"} |
|
1206 hints that executable content for constants @{text consts} |
|
1207 has changed. |
|
1208 |
|
1209 \end{description} |
|
1210 |
|
1211 An instance of @{ML_functor CodeDataFun} provides the following |
|
1212 interface: |
|
1213 |
|
1214 \medskip |
|
1215 \begin{tabular}{l} |
|
1216 @{text "get: theory \<rightarrow> T"} \\ |
|
1217 @{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\ |
|
1218 @{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"} |
|
1219 \end{tabular} |
|
1220 |
|
1221 \begin{description} |
|
1222 |
|
1223 \item @{text get} retrieval of the current data. |
|
1224 |
|
1225 \item @{text change} update of current data (cached!) |
|
1226 by giving a continuation. |
|
1227 |
|
1228 \item @{text change_yield} update with side result. |
|
1229 |
|
1230 \end{description} |
|
1231 *} |
|
1232 |
|
1233 (*subsubsection {* Datatype hooks *} |
|
1234 |
|
1235 text {* |
|
1236 Isabelle/HOL's datatype package provides a mechanism to |
|
1237 extend theories depending on datatype declarations: |
|
1238 \emph{datatype hooks}. For example, when declaring a new |
|
1239 datatype, a hook proves defining equations for equality on |
|
1240 that datatype (if possible). |
|
1241 *} |
|
1242 |
|
1243 text %mlref {* |
|
1244 \begin{mldecls} |
|
1245 @{index_ML_type DatatypeHooks.hook: "string list -> theory -> theory"} \\ |
|
1246 @{index_ML DatatypeHooks.add: "DatatypeHooks.hook -> theory -> theory"} |
|
1247 \end{mldecls} |
|
1248 |
|
1249 \begin{description} |
|
1250 |
|
1251 \item @{ML_type DatatypeHooks.hook} specifies the interface |
|
1252 of \emph{datatype hooks}: a theory update |
|
1253 depending on the list of newly introduced |
|
1254 datatype names. |
|
1255 |
|
1256 \item @{ML DatatypeHooks.add} adds a hook to the |
|
1257 chain of all hooks. |
|
1258 |
|
1259 \end{description} |
|
1260 *} |
|
1261 |
|
1262 subsubsection {* Trivial typedefs -- type copies *} |
|
1263 |
|
1264 text {* |
|
1265 Sometimes packages will introduce new types |
|
1266 as \emph{marked type copies} similar to Haskell's |
|
1267 @{text newtype} declaration (e.g. the HOL record package) |
|
1268 \emph{without} tinkering with the overhead of datatypes. |
|
1269 Technically, these type copies are trivial forms of typedefs. |
|
1270 Since these type copies in code generation view are nothing |
|
1271 else than datatypes, they have been given a own package |
|
1272 in order to faciliate code generation: |
|
1273 *} |
|
1274 |
|
1275 text %mlref {* |
|
1276 \begin{mldecls} |
|
1277 @{index_ML_type TypecopyPackage.info} \\ |
|
1278 @{index_ML TypecopyPackage.add_typecopy: " |
|
1279 bstring * string list -> typ -> (bstring * bstring) option |
|
1280 -> theory -> (string * TypecopyPackage.info) * theory"} \\ |
|
1281 @{index_ML TypecopyPackage.get_typecopy_info: "theory |
|
1282 -> string -> TypecopyPackage.info option"} \\ |
|
1283 @{index_ML TypecopyPackage.get_spec: "theory -> string |
|
1284 -> (string * sort) list * (string * typ list) list"} \\ |
|
1285 @{index_ML_type TypecopyPackage.hook: "string * TypecopyPackage.info -> theory -> theory"} \\ |
|
1286 @{index_ML TypecopyPackage.add_hook: |
|
1287 "TypecopyPackage.hook -> theory -> theory"} \\ |
|
1288 \end{mldecls} |
|
1289 |
|
1290 \begin{description} |
|
1291 |
|
1292 \item @{ML_type TypecopyPackage.info} a record containing |
|
1293 the specification and further data of a type copy. |
|
1294 |
|
1295 \item @{ML TypecopyPackage.add_typecopy} defines a new |
|
1296 type copy. |
|
1297 |
|
1298 \item @{ML TypecopyPackage.get_typecopy_info} retrieves |
|
1299 data of an existing type copy. |
|
1300 |
|
1301 \item @{ML TypecopyPackage.get_spec} retrieves datatype-like |
|
1302 specification of a type copy. |
|
1303 |
|
1304 \item @{ML_type TypecopyPackage.hook},~@{ML TypecopyPackage.add_hook} |
|
1305 provide a hook mechanism corresponding to the hook mechanism |
|
1306 on datatypes. |
|
1307 |
|
1308 \end{description} |
|
1309 *} |
|
1310 |
|
1311 subsubsection {* Unifying type copies and datatypes *} |
|
1312 |
|
1313 text {* |
|
1314 Since datatypes and type copies are mapped to the same concept (datatypes) |
|
1315 by code generation, the view on both is unified \qt{code types}: |
|
1316 *} |
|
1317 |
|
1318 text %mlref {* |
|
1319 \begin{mldecls} |
|
1320 @{index_ML_type DatatypeCodegen.hook: "(string * (bool * ((string * sort) list |
|
1321 * (string * typ list) list))) list |
|
1322 -> theory -> theory"} \\ |
|
1323 @{index_ML DatatypeCodegen.add_codetypes_hook_bootstrap: " |
|
1324 DatatypeCodegen.hook -> theory -> theory"} |
|
1325 \end{mldecls} |
|
1326 *} |
|
1327 |
|
1328 text {* |
|
1329 \begin{description} |
|
1330 |
|
1331 \item @{ML_type DatatypeCodegen.hook} specifies the code type hook |
|
1332 interface: a theory transformation depending on a list of |
|
1333 mutual recursive code types; each entry in the list |
|
1334 has the structure @{text "(name, (is_data, (vars, cons)))"} |
|
1335 where @{text name} is the name of the code type, @{text is_data} |
|
1336 is true iff @{text name} is a datatype rather then a type copy, |
|
1337 and @{text "(vars, cons)"} is the specification of the code type. |
|
1338 |
|
1339 \item @{ML DatatypeCodegen.add_codetypes_hook_bootstrap} adds a code |
|
1340 type hook; the hook is immediately processed for all already |
|
1341 existing datatypes, in blocks of mutual recursive datatypes, |
|
1342 where all datatypes a block depends on are processed before |
|
1343 the block. |
|
1344 |
|
1345 \end{description} |
|
1346 *}*) |
|
1347 |
|
1348 text {* |
|
1349 \emph{Happy proving, happy hacking!} |
|
1350 *} |
|
1351 |
|
1352 end |
|