summary |
shortlog |
changelog |
graph |
tags |
branches |
files |
changeset |
file |
revisions |
annotate |
diff |
raw

doc-src/Codegen/Thy/Program.thy

author | haftmann |

Tue May 26 12:31:01 2009 +0200 (2009-05-26) | |

changeset 31254 | 03a35fbc9dc6 |

parent 30938 | c6c9359e474c |

child 32000 | 6f07563dc8a9 |

permissions | -rw-r--r-- |

documented print_codeproc command

1 theory Program

2 imports Introduction

3 begin

5 section {* Turning Theories into Programs \label{sec:program} *}

7 subsection {* The @{text "Isabelle/HOL"} default setup *}

9 text {*

10 We have already seen how by default equations stemming from

11 @{command definition}/@{command primrec}/@{command fun}

12 statements are used for code generation. This default behaviour

13 can be changed, e.g. by providing different code equations.

14 All kinds of customisation shown in this section is \emph{safe}

15 in the sense that the user does not have to worry about

16 correctness -- all programs generatable that way are partially

17 correct.

18 *}

20 subsection {* Selecting code equations *}

22 text {*

23 Coming back to our introductory example, we

24 could provide an alternative code equations for @{const dequeue}

25 explicitly:

26 *}

28 lemma %quote [code]:

29 "dequeue (AQueue xs []) =

30 (if xs = [] then (None, AQueue [] [])

31 else dequeue (AQueue [] (rev xs)))"

32 "dequeue (AQueue xs (y # ys)) =

33 (Some y, AQueue xs ys)"

34 by (cases xs, simp_all) (cases "rev xs", simp_all)

36 text {*

37 \noindent The annotation @{text "[code]"} is an @{text Isar}

38 @{text attribute} which states that the given theorems should be

39 considered as code equations for a @{text fun} statement --

40 the corresponding constant is determined syntactically. The resulting code:

41 *}

43 text %quote {*@{code_stmts dequeue (consts) dequeue (Haskell)}*}

45 text {*

46 \noindent You may note that the equality test @{term "xs = []"} has been

47 replaced by the predicate @{term "null xs"}. This is due to the default

48 setup in the \qn{preprocessor} to be discussed further below (\secref{sec:preproc}).

50 Changing the default constructor set of datatypes is also

51 possible. See \secref{sec:datatypes} for an example.

53 As told in \secref{sec:concept}, code generation is based

54 on a structured collection of code theorems.

55 For explorative purpose, this collection

56 may be inspected using the @{command code_thms} command:

57 *}

59 code_thms %quote dequeue

61 text {*

62 \noindent prints a table with \emph{all} code equations

63 for @{const dequeue}, including

64 \emph{all} code equations those equations depend

65 on recursively.

67 Similarly, the @{command code_deps} command shows a graph

68 visualising dependencies between code equations.

69 *}

71 subsection {* @{text class} and @{text instantiation} *}

73 text {*

74 Concerning type classes and code generation, let us examine an example

75 from abstract algebra:

76 *}

78 class %quote semigroup =

79 fixes mult :: "'a \<Rightarrow> 'a \<Rightarrow> 'a" (infixl "\<otimes>" 70)

80 assumes assoc: "(x \<otimes> y) \<otimes> z = x \<otimes> (y \<otimes> z)"

82 class %quote monoid = semigroup +

83 fixes neutral :: 'a ("\<one>")

84 assumes neutl: "\<one> \<otimes> x = x"

85 and neutr: "x \<otimes> \<one> = x"

87 instantiation %quote nat :: monoid

88 begin

90 primrec %quote mult_nat where

91 "0 \<otimes> n = (0\<Colon>nat)"

92 | "Suc m \<otimes> n = n + m \<otimes> n"

94 definition %quote neutral_nat where

95 "\<one> = Suc 0"

97 lemma %quote add_mult_distrib:

98 fixes n m q :: nat

99 shows "(n + m) \<otimes> q = n \<otimes> q + m \<otimes> q"

100 by (induct n) simp_all

102 instance %quote proof

103 fix m n q :: nat

104 show "m \<otimes> n \<otimes> q = m \<otimes> (n \<otimes> q)"

105 by (induct m) (simp_all add: add_mult_distrib)

106 show "\<one> \<otimes> n = n"

107 by (simp add: neutral_nat_def)

108 show "m \<otimes> \<one> = m"

109 by (induct m) (simp_all add: neutral_nat_def)

110 qed

112 end %quote

114 text {*

115 \noindent We define the natural operation of the natural numbers

116 on monoids:

117 *}

119 primrec %quote (in monoid) pow :: "nat \<Rightarrow> 'a \<Rightarrow> 'a" where

120 "pow 0 a = \<one>"

121 | "pow (Suc n) a = a \<otimes> pow n a"

123 text {*

124 \noindent This we use to define the discrete exponentiation function:

125 *}

127 definition %quote bexp :: "nat \<Rightarrow> nat" where

128 "bexp n = pow n (Suc (Suc 0))"

130 text {*

131 \noindent The corresponding code:

132 *}

134 text %quote {*@{code_stmts bexp (Haskell)}*}

136 text {*

137 \noindent This is a convenient place to show how explicit dictionary construction

138 manifests in generated code (here, the same example in @{text SML}):

139 *}

141 text %quote {*@{code_stmts bexp (SML)}*}

143 text {*

144 \noindent Note the parameters with trailing underscore (@{verbatim "A_"})

145 which are the dictionary parameters.

146 *}

148 subsection {* The preprocessor \label{sec:preproc} *}

150 text {*

151 Before selected function theorems are turned into abstract

152 code, a chain of definitional transformation steps is carried

153 out: \emph{preprocessing}. In essence, the preprocessor

154 consists of two components: a \emph{simpset} and \emph{function transformers}.

156 The \emph{simpset} allows to employ the full generality of the Isabelle

157 simplifier. Due to the interpretation of theorems

158 as code equations, rewrites are applied to the right

159 hand side and the arguments of the left hand side of an

160 equation, but never to the constant heading the left hand side.

161 An important special case are \emph{inline theorems} which may be

162 declared and undeclared using the

163 \emph{code inline} or \emph{code inline del} attribute respectively.

165 Some common applications:

166 *}

168 text_raw {*

169 \begin{itemize}

170 *}

172 text {*

173 \item replacing non-executable constructs by executable ones:

174 *}

176 lemma %quote [code inline]:

177 "x \<in> set xs \<longleftrightarrow> x mem xs" by (induct xs) simp_all

179 text {*

180 \item eliminating superfluous constants:

181 *}

183 lemma %quote [code inline]:

184 "1 = Suc 0" by simp

186 text {*

187 \item replacing executable but inconvenient constructs:

188 *}

190 lemma %quote [code inline]:

191 "xs = [] \<longleftrightarrow> List.null xs" by (induct xs) simp_all

193 text_raw {*

194 \end{itemize}

195 *}

197 text {*

198 \noindent \emph{Function transformers} provide a very general interface,

199 transforming a list of function theorems to another

200 list of function theorems, provided that neither the heading

201 constant nor its type change. The @{term "0\<Colon>nat"} / @{const Suc}

202 pattern elimination implemented in

203 theory @{text Efficient_Nat} (see \secref{eff_nat}) uses this

204 interface.

206 \noindent The current setup of the preprocessor may be inspected using

207 the @{command print_codeproc} command.

208 @{command code_thms} provides a convenient

209 mechanism to inspect the impact of a preprocessor setup

210 on code equations.

212 \begin{warn}

213 The attribute \emph{code unfold}

214 associated with the @{text "SML code generator"} also applies to

215 the @{text "generic code generator"}:

216 \emph{code unfold} implies \emph{code inline}.

217 \end{warn}

218 *}

220 subsection {* Datatypes \label{sec:datatypes} *}

222 text {*

223 Conceptually, any datatype is spanned by a set of

224 \emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"} where @{text

225 "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is exactly the set of \emph{all} type variables in

226 @{text "\<tau>"}. The HOL datatype package by default registers any new

227 datatype in the table of datatypes, which may be inspected using the

228 @{command print_codesetup} command.

230 In some cases, it is appropriate to alter or extend this table. As

231 an example, we will develop an alternative representation of the

232 queue example given in \secref{sec:intro}. The amortised

233 representation is convenient for generating code but exposes its

234 \qt{implementation} details, which may be cumbersome when proving

235 theorems about it. Therefore, here a simple, straightforward

236 representation of queues:

237 *}

239 datatype %quote 'a queue = Queue "'a list"

241 definition %quote empty :: "'a queue" where

242 "empty = Queue []"

244 primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where

245 "enqueue x (Queue xs) = Queue (xs @ [x])"

247 fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where

248 "dequeue (Queue []) = (None, Queue [])"

249 | "dequeue (Queue (x # xs)) = (Some x, Queue xs)"

251 text {*

252 \noindent This we can use directly for proving; for executing,

253 we provide an alternative characterisation:

254 *}

256 definition %quote AQueue :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a queue" where

257 "AQueue xs ys = Queue (ys @ rev xs)"

259 code_datatype %quote AQueue

261 text {*

262 \noindent Here we define a \qt{constructor} @{const "AQueue"} which

263 is defined in terms of @{text "Queue"} and interprets its arguments

264 according to what the \emph{content} of an amortised queue is supposed

265 to be. Equipped with this, we are able to prove the following equations

266 for our primitive queue operations which \qt{implement} the simple

267 queues in an amortised fashion:

268 *}

270 lemma %quote empty_AQueue [code]:

271 "empty = AQueue [] []"

272 unfolding AQueue_def empty_def by simp

274 lemma %quote enqueue_AQueue [code]:

275 "enqueue x (AQueue xs ys) = AQueue (x # xs) ys"

276 unfolding AQueue_def by simp

278 lemma %quote dequeue_AQueue [code]:

279 "dequeue (AQueue xs []) =

280 (if xs = [] then (None, AQueue [] [])

281 else dequeue (AQueue [] (rev xs)))"

282 "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)"

283 unfolding AQueue_def by simp_all

285 text {*

286 \noindent For completeness, we provide a substitute for the

287 @{text case} combinator on queues:

288 *}

290 lemma %quote queue_case_AQueue [code]:

291 "queue_case f (AQueue xs ys) = f (ys @ rev xs)"

292 unfolding AQueue_def by simp

294 text {*

295 \noindent The resulting code looks as expected:

296 *}

298 text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}

300 text {*

301 \noindent From this example, it can be glimpsed that using own

302 constructor sets is a little delicate since it changes the set of

303 valid patterns for values of that type. Without going into much

304 detail, here some practical hints:

306 \begin{itemize}

308 \item When changing the constructor set for datatypes, take care

309 to provide alternative equations for the @{text case} combinator.

311 \item Values in the target language need not to be normalised --

312 different values in the target language may represent the same

313 value in the logic.

315 \item Usually, a good methodology to deal with the subtleties of

316 pattern matching is to see the type as an abstract type: provide

317 a set of operations which operate on the concrete representation

318 of the type, and derive further operations by combinations of

319 these primitive ones, without relying on a particular

320 representation.

322 \end{itemize}

323 *}

326 subsection {* Equality *}

328 text {*

329 Surely you have already noticed how equality is treated

330 by the code generator:

331 *}

333 primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where

334 "collect_duplicates xs ys [] = xs"

335 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs

336 then if z \<in> set ys

337 then collect_duplicates xs ys zs

338 else collect_duplicates xs (z#ys) zs

339 else collect_duplicates (z#xs) (z#ys) zs)"

341 text {*

342 \noindent The membership test during preprocessing is rewritten,

343 resulting in @{const List.member}, which itself

344 performs an explicit equality check.

345 *}

347 text %quote {*@{code_stmts collect_duplicates (SML)}*}

349 text {*

350 \noindent Obviously, polymorphic equality is implemented the Haskell

351 way using a type class. How is this achieved? HOL introduces

352 an explicit class @{class eq} with a corresponding operation

353 @{const eq_class.eq} such that @{thm eq [no_vars]}.

354 The preprocessing framework does the rest by propagating the

355 @{class eq} constraints through all dependent code equations.

356 For datatypes, instances of @{class eq} are implicitly derived

357 when possible. For other types, you may instantiate @{text eq}

358 manually like any other type class.

360 Though this @{text eq} class is designed to get rarely in

361 the way, in some cases the automatically derived code equations

362 for equality on a particular type may not be appropriate.

363 As example, watch the following datatype representing

364 monomorphic parametric types (where type constructors

365 are referred to by natural numbers):

366 *}

368 datatype %quote monotype = Mono nat "monotype list"

369 (*<*)

370 lemma monotype_eq:

371 "eq_class.eq (Mono tyco1 typargs1) (Mono tyco2 typargs2) \<equiv>

372 eq_class.eq tyco1 tyco2 \<and> eq_class.eq typargs1 typargs2" by (simp add: eq)

373 (*>*)

375 text {*

376 \noindent Then code generation for SML would fail with a message

377 that the generated code contains illegal mutual dependencies:

378 the theorem @{thm monotype_eq [no_vars]} already requires the

379 instance @{text "monotype \<Colon> eq"}, which itself requires

380 @{thm monotype_eq [no_vars]}; Haskell has no problem with mutually

381 recursive @{text instance} and @{text function} definitions,

382 but the SML serialiser does not support this.

384 In such cases, you have to provide your own equality equations

385 involving auxiliary constants. In our case,

386 @{const [show_types] list_all2} can do the job:

387 *}

389 lemma %quote monotype_eq_list_all2 [code]:

390 "eq_class.eq (Mono tyco1 typargs1) (Mono tyco2 typargs2) \<longleftrightarrow>

391 eq_class.eq tyco1 tyco2 \<and> list_all2 eq_class.eq typargs1 typargs2"

392 by (simp add: eq list_all2_eq [symmetric])

394 text {*

395 \noindent does not depend on instance @{text "monotype \<Colon> eq"}:

396 *}

398 text %quote {*@{code_stmts "eq_class.eq :: monotype \<Rightarrow> monotype \<Rightarrow> bool" (SML)}*}

401 subsection {* Explicit partiality *}

403 text {*

404 Partiality usually enters the game by partial patterns, as

405 in the following example, again for amortised queues:

406 *}

408 definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where

409 "strict_dequeue q = (case dequeue q

410 of (Some x, q') \<Rightarrow> (x, q'))"

412 lemma %quote strict_dequeue_AQueue [code]:

413 "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)"

414 "strict_dequeue (AQueue xs []) =

415 (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))"

416 by (simp_all add: strict_dequeue_def dequeue_AQueue split: list.splits)

418 text {*

419 \noindent In the corresponding code, there is no equation

420 for the pattern @{term "AQueue [] []"}:

421 *}

423 text %quote {*@{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}*}

425 text {*

426 \noindent In some cases it is desirable to have this

427 pseudo-\qt{partiality} more explicitly, e.g.~as follows:

428 *}

430 axiomatization %quote empty_queue :: 'a

432 definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where

433 "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)"

435 lemma %quote strict_dequeue'_AQueue [code]:

436 "strict_dequeue' (AQueue xs []) = (if xs = [] then empty_queue

437 else strict_dequeue' (AQueue [] (rev xs)))"

438 "strict_dequeue' (AQueue xs (y # ys)) =

439 (y, AQueue xs ys)"

440 by (simp_all add: strict_dequeue'_def dequeue_AQueue split: list.splits)

442 text {*

443 Observe that on the right hand side of the definition of @{const

444 "strict_dequeue'"} the constant @{const empty_queue} occurs

445 which is unspecified.

447 Normally, if constants without any code equations occur in a

448 program, the code generator complains (since in most cases this is

449 not what the user expects). But such constants can also be thought

450 of as function definitions with no equations which always fail,

451 since there is never a successful pattern match on the left hand

452 side. In order to categorise a constant into that category

453 explicitly, use @{command "code_abort"}:

454 *}

456 code_abort %quote empty_queue

458 text {*

459 \noindent Then the code generator will just insert an error or

460 exception at the appropriate position:

461 *}

463 text %quote {*@{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}*}

465 text {*

466 \noindent This feature however is rarely needed in practice.

467 Note also that the @{text HOL} default setup already declares

468 @{const undefined} as @{command "code_abort"}, which is most

469 likely to be used in such situations.

470 *}

472 end