|
1 theory Program |
|
2 imports Introduction |
|
3 begin |
|
4 |
|
5 section {* Turning Theories into Programs \label{sec:program} *} |
|
6 |
|
7 subsection {* The @{text "Isabelle/HOL"} default setup *} |
|
8 |
|
9 text {* |
|
10 We have already seen how by default equations stemming from |
|
11 @{command definition}/@{command primrec}/@{command fun} |
|
12 statements are used for code generation. This default behaviour |
|
13 can be changed, e.g. by providing different code equations. |
|
14 All kinds of customisation shown in this section is \emph{safe} |
|
15 in the sense that the user does not have to worry about |
|
16 correctness -- all programs generatable that way are partially |
|
17 correct. |
|
18 *} |
|
19 |
|
20 subsection {* Selecting code equations *} |
|
21 |
|
22 text {* |
|
23 Coming back to our introductory example, we |
|
24 could provide an alternative code equations for @{const dequeue} |
|
25 explicitly: |
|
26 *} |
|
27 |
|
28 lemma %quote [code]: |
|
29 "dequeue (AQueue xs []) = |
|
30 (if xs = [] then (None, AQueue [] []) |
|
31 else dequeue (AQueue [] (rev xs)))" |
|
32 "dequeue (AQueue xs (y # ys)) = |
|
33 (Some y, AQueue xs ys)" |
|
34 by (cases xs, simp_all) (cases "rev xs", simp_all) |
|
35 |
|
36 text {* |
|
37 \noindent The annotation @{text "[code]"} is an @{text Isar} |
|
38 @{text attribute} which states that the given theorems should be |
|
39 considered as code equations for a @{text fun} statement -- |
|
40 the corresponding constant is determined syntactically. The resulting code: |
|
41 *} |
|
42 |
|
43 text %quote {*@{code_stmts dequeue (consts) dequeue (Haskell)}*} |
|
44 |
|
45 text {* |
|
46 \noindent You may note that the equality test @{term "xs = []"} has been |
|
47 replaced by the predicate @{term "null xs"}. This is due to the default |
|
48 setup in the \qn{preprocessor} to be discussed further below (\secref{sec:preproc}). |
|
49 |
|
50 Changing the default constructor set of datatypes is also |
|
51 possible. See \secref{sec:datatypes} for an example. |
|
52 |
|
53 As told in \secref{sec:concept}, code generation is based |
|
54 on a structured collection of code theorems. |
|
55 For explorative purpose, this collection |
|
56 may be inspected using the @{command code_thms} command: |
|
57 *} |
|
58 |
|
59 code_thms %quote dequeue |
|
60 |
|
61 text {* |
|
62 \noindent prints a table with \emph{all} code equations |
|
63 for @{const dequeue}, including |
|
64 \emph{all} code equations those equations depend |
|
65 on recursively. |
|
66 |
|
67 Similarly, the @{command code_deps} command shows a graph |
|
68 visualising dependencies between code equations. |
|
69 *} |
|
70 |
|
71 subsection {* @{text class} and @{text instantiation} *} |
|
72 |
|
73 text {* |
|
74 Concerning type classes and code generation, let us examine an example |
|
75 from abstract algebra: |
|
76 *} |
|
77 |
|
78 class %quote semigroup = |
|
79 fixes mult :: "'a \<Rightarrow> 'a \<Rightarrow> 'a" (infixl "\<otimes>" 70) |
|
80 assumes assoc: "(x \<otimes> y) \<otimes> z = x \<otimes> (y \<otimes> z)" |
|
81 |
|
82 class %quote monoid = semigroup + |
|
83 fixes neutral :: 'a ("\<one>") |
|
84 assumes neutl: "\<one> \<otimes> x = x" |
|
85 and neutr: "x \<otimes> \<one> = x" |
|
86 |
|
87 instantiation %quote nat :: monoid |
|
88 begin |
|
89 |
|
90 primrec %quote mult_nat where |
|
91 "0 \<otimes> n = (0\<Colon>nat)" |
|
92 | "Suc m \<otimes> n = n + m \<otimes> n" |
|
93 |
|
94 definition %quote neutral_nat where |
|
95 "\<one> = Suc 0" |
|
96 |
|
97 lemma %quote add_mult_distrib: |
|
98 fixes n m q :: nat |
|
99 shows "(n + m) \<otimes> q = n \<otimes> q + m \<otimes> q" |
|
100 by (induct n) simp_all |
|
101 |
|
102 instance %quote proof |
|
103 fix m n q :: nat |
|
104 show "m \<otimes> n \<otimes> q = m \<otimes> (n \<otimes> q)" |
|
105 by (induct m) (simp_all add: add_mult_distrib) |
|
106 show "\<one> \<otimes> n = n" |
|
107 by (simp add: neutral_nat_def) |
|
108 show "m \<otimes> \<one> = m" |
|
109 by (induct m) (simp_all add: neutral_nat_def) |
|
110 qed |
|
111 |
|
112 end %quote |
|
113 |
|
114 text {* |
|
115 \noindent We define the natural operation of the natural numbers |
|
116 on monoids: |
|
117 *} |
|
118 |
|
119 primrec %quote (in monoid) pow :: "nat \<Rightarrow> 'a \<Rightarrow> 'a" where |
|
120 "pow 0 a = \<one>" |
|
121 | "pow (Suc n) a = a \<otimes> pow n a" |
|
122 |
|
123 text {* |
|
124 \noindent This we use to define the discrete exponentiation function: |
|
125 *} |
|
126 |
|
127 definition %quote bexp :: "nat \<Rightarrow> nat" where |
|
128 "bexp n = pow n (Suc (Suc 0))" |
|
129 |
|
130 text {* |
|
131 \noindent The corresponding code: |
|
132 *} |
|
133 |
|
134 text %quote {*@{code_stmts bexp (Haskell)}*} |
|
135 |
|
136 text {* |
|
137 \noindent This is a convenient place to show how explicit dictionary construction |
|
138 manifests in generated code (here, the same example in @{text SML}): |
|
139 *} |
|
140 |
|
141 text %quote {*@{code_stmts bexp (SML)}*} |
|
142 |
|
143 text {* |
|
144 \noindent Note the parameters with trailing underscore (@{verbatim "A_"}) |
|
145 which are the dictionary parameters. |
|
146 *} |
|
147 |
|
148 subsection {* The preprocessor \label{sec:preproc} *} |
|
149 |
|
150 text {* |
|
151 Before selected function theorems are turned into abstract |
|
152 code, a chain of definitional transformation steps is carried |
|
153 out: \emph{preprocessing}. In essence, the preprocessor |
|
154 consists of two components: a \emph{simpset} and \emph{function transformers}. |
|
155 |
|
156 The \emph{simpset} allows to employ the full generality of the Isabelle |
|
157 simplifier. Due to the interpretation of theorems |
|
158 as code equations, rewrites are applied to the right |
|
159 hand side and the arguments of the left hand side of an |
|
160 equation, but never to the constant heading the left hand side. |
|
161 An important special case are \emph{inline theorems} which may be |
|
162 declared and undeclared using the |
|
163 \emph{code inline} or \emph{code inline del} attribute respectively. |
|
164 |
|
165 Some common applications: |
|
166 *} |
|
167 |
|
168 text_raw {* |
|
169 \begin{itemize} |
|
170 *} |
|
171 |
|
172 text {* |
|
173 \item replacing non-executable constructs by executable ones: |
|
174 *} |
|
175 |
|
176 lemma %quote [code inline]: |
|
177 "x \<in> set xs \<longleftrightarrow> x mem xs" by (induct xs) simp_all |
|
178 |
|
179 text {* |
|
180 \item eliminating superfluous constants: |
|
181 *} |
|
182 |
|
183 lemma %quote [code inline]: |
|
184 "1 = Suc 0" by simp |
|
185 |
|
186 text {* |
|
187 \item replacing executable but inconvenient constructs: |
|
188 *} |
|
189 |
|
190 lemma %quote [code inline]: |
|
191 "xs = [] \<longleftrightarrow> List.null xs" by (induct xs) simp_all |
|
192 |
|
193 text_raw {* |
|
194 \end{itemize} |
|
195 *} |
|
196 |
|
197 text {* |
|
198 \noindent \emph{Function transformers} provide a very general interface, |
|
199 transforming a list of function theorems to another |
|
200 list of function theorems, provided that neither the heading |
|
201 constant nor its type change. The @{term "0\<Colon>nat"} / @{const Suc} |
|
202 pattern elimination implemented in |
|
203 theory @{text Efficient_Nat} (see \secref{eff_nat}) uses this |
|
204 interface. |
|
205 |
|
206 \noindent The current setup of the preprocessor may be inspected using |
|
207 the @{command print_codesetup} command. |
|
208 @{command code_thms} provides a convenient |
|
209 mechanism to inspect the impact of a preprocessor setup |
|
210 on code equations. |
|
211 |
|
212 \begin{warn} |
|
213 The attribute \emph{code unfold} |
|
214 associated with the @{text "SML code generator"} also applies to |
|
215 the @{text "generic code generator"}: |
|
216 \emph{code unfold} implies \emph{code inline}. |
|
217 \end{warn} |
|
218 *} |
|
219 |
|
220 subsection {* Datatypes \label{sec:datatypes} *} |
|
221 |
|
222 text {* |
|
223 Conceptually, any datatype is spanned by a set of |
|
224 \emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} where @{text |
|
225 "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is exactly the set of \emph{all} type variables in |
|
226 @{text "\<tau>"}. The HOL datatype package by default registers any new |
|
227 datatype in the table of datatypes, which may be inspected using the |
|
228 @{command print_codesetup} command. |
|
229 |
|
230 In some cases, it is appropriate to alter or extend this table. As |
|
231 an example, we will develop an alternative representation of the |
|
232 queue example given in \secref{sec:intro}. The amortised |
|
233 representation is convenient for generating code but exposes its |
|
234 \qt{implementation} details, which may be cumbersome when proving |
|
235 theorems about it. Therefore, here a simple, straightforward |
|
236 representation of queues: |
|
237 *} |
|
238 |
|
239 datatype %quote 'a queue = Queue "'a list" |
|
240 |
|
241 definition %quote empty :: "'a queue" where |
|
242 "empty = Queue []" |
|
243 |
|
244 primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where |
|
245 "enqueue x (Queue xs) = Queue (xs @ [x])" |
|
246 |
|
247 fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where |
|
248 "dequeue (Queue []) = (None, Queue [])" |
|
249 | "dequeue (Queue (x # xs)) = (Some x, Queue xs)" |
|
250 |
|
251 text {* |
|
252 \noindent This we can use directly for proving; for executing, |
|
253 we provide an alternative characterisation: |
|
254 *} |
|
255 |
|
256 definition %quote AQueue :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a queue" where |
|
257 "AQueue xs ys = Queue (ys @ rev xs)" |
|
258 |
|
259 code_datatype %quote AQueue |
|
260 |
|
261 text {* |
|
262 \noindent Here we define a \qt{constructor} @{const "AQueue"} which |
|
263 is defined in terms of @{text "Queue"} and interprets its arguments |
|
264 according to what the \emph{content} of an amortised queue is supposed |
|
265 to be. Equipped with this, we are able to prove the following equations |
|
266 for our primitive queue operations which \qt{implement} the simple |
|
267 queues in an amortised fashion: |
|
268 *} |
|
269 |
|
270 lemma %quote empty_AQueue [code]: |
|
271 "empty = AQueue [] []" |
|
272 unfolding AQueue_def empty_def by simp |
|
273 |
|
274 lemma %quote enqueue_AQueue [code]: |
|
275 "enqueue x (AQueue xs ys) = AQueue (x # xs) ys" |
|
276 unfolding AQueue_def by simp |
|
277 |
|
278 lemma %quote dequeue_AQueue [code]: |
|
279 "dequeue (AQueue xs []) = |
|
280 (if xs = [] then (None, AQueue [] []) |
|
281 else dequeue (AQueue [] (rev xs)))" |
|
282 "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)" |
|
283 unfolding AQueue_def by simp_all |
|
284 |
|
285 text {* |
|
286 \noindent For completeness, we provide a substitute for the |
|
287 @{text case} combinator on queues: |
|
288 *} |
|
289 |
|
290 definition %quote |
|
291 aqueue_case_def: "aqueue_case = queue_case" |
|
292 |
|
293 lemma %quote aqueue_case [code, code inline]: |
|
294 "queue_case = aqueue_case" |
|
295 unfolding aqueue_case_def .. |
|
296 |
|
297 lemma %quote case_AQueue [code]: |
|
298 "aqueue_case f (AQueue xs ys) = f (ys @ rev xs)" |
|
299 unfolding aqueue_case_def AQueue_def by simp |
|
300 |
|
301 text {* |
|
302 \noindent The resulting code looks as expected: |
|
303 *} |
|
304 |
|
305 text %quote {*@{code_stmts empty enqueue dequeue (SML)}*} |
|
306 |
|
307 text {* |
|
308 \noindent From this example, it can be glimpsed that using own |
|
309 constructor sets is a little delicate since it changes the set of |
|
310 valid patterns for values of that type. Without going into much |
|
311 detail, here some practical hints: |
|
312 |
|
313 \begin{itemize} |
|
314 |
|
315 \item When changing the constructor set for datatypes, take care |
|
316 to provide an alternative for the @{text case} combinator |
|
317 (e.g.~by replacing it using the preprocessor). |
|
318 |
|
319 \item Values in the target language need not to be normalised -- |
|
320 different values in the target language may represent the same |
|
321 value in the logic. |
|
322 |
|
323 \item Usually, a good methodology to deal with the subtleties of |
|
324 pattern matching is to see the type as an abstract type: provide |
|
325 a set of operations which operate on the concrete representation |
|
326 of the type, and derive further operations by combinations of |
|
327 these primitive ones, without relying on a particular |
|
328 representation. |
|
329 |
|
330 \end{itemize} |
|
331 *} |
|
332 |
|
333 |
|
334 subsection {* Equality and wellsortedness *} |
|
335 |
|
336 text {* |
|
337 Surely you have already noticed how equality is treated |
|
338 by the code generator: |
|
339 *} |
|
340 |
|
341 primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where |
|
342 "collect_duplicates xs ys [] = xs" |
|
343 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs |
|
344 then if z \<in> set ys |
|
345 then collect_duplicates xs ys zs |
|
346 else collect_duplicates xs (z#ys) zs |
|
347 else collect_duplicates (z#xs) (z#ys) zs)" |
|
348 |
|
349 text {* |
|
350 \noindent The membership test during preprocessing is rewritten, |
|
351 resulting in @{const List.member}, which itself |
|
352 performs an explicit equality check. |
|
353 *} |
|
354 |
|
355 text %quote {*@{code_stmts collect_duplicates (SML)}*} |
|
356 |
|
357 text {* |
|
358 \noindent Obviously, polymorphic equality is implemented the Haskell |
|
359 way using a type class. How is this achieved? HOL introduces |
|
360 an explicit class @{class eq} with a corresponding operation |
|
361 @{const eq_class.eq} such that @{thm eq [no_vars]}. |
|
362 The preprocessing framework does the rest by propagating the |
|
363 @{class eq} constraints through all dependent code equations. |
|
364 For datatypes, instances of @{class eq} are implicitly derived |
|
365 when possible. For other types, you may instantiate @{text eq} |
|
366 manually like any other type class. |
|
367 |
|
368 Though this @{text eq} class is designed to get rarely in |
|
369 the way, a subtlety |
|
370 enters the stage when definitions of overloaded constants |
|
371 are dependent on operational equality. For example, let |
|
372 us define a lexicographic ordering on tuples |
|
373 (also see theory @{theory Product_ord}): |
|
374 *} |
|
375 |
|
376 instantiation %quote "*" :: (order, order) order |
|
377 begin |
|
378 |
|
379 definition %quote [code del]: |
|
380 "x \<le> y \<longleftrightarrow> fst x < fst y \<or> fst x = fst y \<and> snd x \<le> snd y" |
|
381 |
|
382 definition %quote [code del]: |
|
383 "x < y \<longleftrightarrow> fst x < fst y \<or> fst x = fst y \<and> snd x < snd y" |
|
384 |
|
385 instance %quote proof |
|
386 qed (auto simp: less_eq_prod_def less_prod_def intro: order_less_trans) |
|
387 |
|
388 end %quote |
|
389 |
|
390 lemma %quote order_prod [code]: |
|
391 "(x1 \<Colon> 'a\<Colon>order, y1 \<Colon> 'b\<Colon>order) < (x2, y2) \<longleftrightarrow> |
|
392 x1 < x2 \<or> x1 = x2 \<and> y1 < y2" |
|
393 "(x1 \<Colon> 'a\<Colon>order, y1 \<Colon> 'b\<Colon>order) \<le> (x2, y2) \<longleftrightarrow> |
|
394 x1 < x2 \<or> x1 = x2 \<and> y1 \<le> y2" |
|
395 by (simp_all add: less_prod_def less_eq_prod_def) |
|
396 |
|
397 text {* |
|
398 \noindent Then code generation will fail. Why? The definition |
|
399 of @{term "op \<le>"} depends on equality on both arguments, |
|
400 which are polymorphic and impose an additional @{class eq} |
|
401 class constraint, which the preprocessor does not propagate |
|
402 (for technical reasons). |
|
403 |
|
404 The solution is to add @{class eq} explicitly to the first sort arguments in the |
|
405 code theorems: |
|
406 *} |
|
407 |
|
408 lemma %quote order_prod_code [code]: |
|
409 "(x1 \<Colon> 'a\<Colon>{order, eq}, y1 \<Colon> 'b\<Colon>order) < (x2, y2) \<longleftrightarrow> |
|
410 x1 < x2 \<or> x1 = x2 \<and> y1 < y2" |
|
411 "(x1 \<Colon> 'a\<Colon>{order, eq}, y1 \<Colon> 'b\<Colon>order) \<le> (x2, y2) \<longleftrightarrow> |
|
412 x1 < x2 \<or> x1 = x2 \<and> y1 \<le> y2" |
|
413 by (simp_all add: less_prod_def less_eq_prod_def) |
|
414 |
|
415 text {* |
|
416 \noindent Then code generation succeeds: |
|
417 *} |
|
418 |
|
419 text %quote {*@{code_stmts "op \<le> \<Colon> _ \<times> _ \<Rightarrow> _ \<times> _ \<Rightarrow> bool" (SML)}*} |
|
420 |
|
421 text {* |
|
422 In some cases, the automatically derived code equations |
|
423 for equality on a particular type may not be appropriate. |
|
424 As example, watch the following datatype representing |
|
425 monomorphic parametric types (where type constructors |
|
426 are referred to by natural numbers): |
|
427 *} |
|
428 |
|
429 datatype %quote monotype = Mono nat "monotype list" |
|
430 (*<*) |
|
431 lemma monotype_eq: |
|
432 "eq_class.eq (Mono tyco1 typargs1) (Mono tyco2 typargs2) \<equiv> |
|
433 eq_class.eq tyco1 tyco2 \<and> eq_class.eq typargs1 typargs2" by (simp add: eq) |
|
434 (*>*) |
|
435 |
|
436 text {* |
|
437 \noindent Then code generation for SML would fail with a message |
|
438 that the generated code contains illegal mutual dependencies: |
|
439 the theorem @{thm monotype_eq [no_vars]} already requires the |
|
440 instance @{text "monotype \<Colon> eq"}, which itself requires |
|
441 @{thm monotype_eq [no_vars]}; Haskell has no problem with mutually |
|
442 recursive @{text instance} and @{text function} definitions, |
|
443 but the SML serialiser does not support this. |
|
444 |
|
445 In such cases, you have to provide your own equality equations |
|
446 involving auxiliary constants. In our case, |
|
447 @{const [show_types] list_all2} can do the job: |
|
448 *} |
|
449 |
|
450 lemma %quote monotype_eq_list_all2 [code]: |
|
451 "eq_class.eq (Mono tyco1 typargs1) (Mono tyco2 typargs2) \<longleftrightarrow> |
|
452 eq_class.eq tyco1 tyco2 \<and> list_all2 eq_class.eq typargs1 typargs2" |
|
453 by (simp add: eq list_all2_eq [symmetric]) |
|
454 |
|
455 text {* |
|
456 \noindent does not depend on instance @{text "monotype \<Colon> eq"}: |
|
457 *} |
|
458 |
|
459 text %quote {*@{code_stmts "eq_class.eq :: monotype \<Rightarrow> monotype \<Rightarrow> bool" (SML)}*} |
|
460 |
|
461 |
|
462 subsection {* Explicit partiality *} |
|
463 |
|
464 text {* |
|
465 Partiality usually enters the game by partial patterns, as |
|
466 in the following example, again for amortised queues: |
|
467 *} |
|
468 |
|
469 definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where |
|
470 "strict_dequeue q = (case dequeue q |
|
471 of (Some x, q') \<Rightarrow> (x, q'))" |
|
472 |
|
473 lemma %quote strict_dequeue_AQueue [code]: |
|
474 "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)" |
|
475 "strict_dequeue (AQueue xs []) = |
|
476 (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))" |
|
477 by (simp_all add: strict_dequeue_def dequeue_AQueue split: list.splits) |
|
478 |
|
479 text {* |
|
480 \noindent In the corresponding code, there is no equation |
|
481 for the pattern @{term "AQueue [] []"}: |
|
482 *} |
|
483 |
|
484 text %quote {*@{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}*} |
|
485 |
|
486 text {* |
|
487 \noindent In some cases it is desirable to have this |
|
488 pseudo-\qt{partiality} more explicitly, e.g.~as follows: |
|
489 *} |
|
490 |
|
491 axiomatization %quote empty_queue :: 'a |
|
492 |
|
493 definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where |
|
494 "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)" |
|
495 |
|
496 lemma %quote strict_dequeue'_AQueue [code]: |
|
497 "strict_dequeue' (AQueue xs []) = (if xs = [] then empty_queue |
|
498 else strict_dequeue' (AQueue [] (rev xs)))" |
|
499 "strict_dequeue' (AQueue xs (y # ys)) = |
|
500 (y, AQueue xs ys)" |
|
501 by (simp_all add: strict_dequeue'_def dequeue_AQueue split: list.splits) |
|
502 |
|
503 text {* |
|
504 Observe that on the right hand side of the definition of @{const |
|
505 "strict_dequeue'"} the constant @{const empty_queue} occurs |
|
506 which is unspecified. |
|
507 |
|
508 Normally, if constants without any code equations occur in a |
|
509 program, the code generator complains (since in most cases this is |
|
510 not what the user expects). But such constants can also be thought |
|
511 of as function definitions with no equations which always fail, |
|
512 since there is never a successful pattern match on the left hand |
|
513 side. In order to categorise a constant into that category |
|
514 explicitly, use @{command "code_abort"}: |
|
515 *} |
|
516 |
|
517 code_abort %quote empty_queue |
|
518 |
|
519 text {* |
|
520 \noindent Then the code generator will just insert an error or |
|
521 exception at the appropriate position: |
|
522 *} |
|
523 |
|
524 text %quote {*@{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}*} |
|
525 |
|
526 text {* |
|
527 \noindent This feature however is rarely needed in practice. |
|
528 Note also that the @{text HOL} default setup already declares |
|
529 @{const undefined} as @{command "code_abort"}, which is most |
|
530 likely to be used in such situations. |
|
531 *} |
|
532 |
|
533 end |
|
534 |