24 application of theorem proving systems in the area of |
24 application of theorem proving systems in the area of |
25 software development and verification, its relevance manifests |
25 software development and verification, its relevance manifests |
26 for running test cases and rapid prototyping. In logical |
26 for running test cases and rapid prototyping. In logical |
27 calculi like constructive type theory, |
27 calculi like constructive type theory, |
28 a notion of executability is implicit due to the nature |
28 a notion of executability is implicit due to the nature |
29 of the calculus. In contrast, specifications in Isabelle/HOL |
29 of the calculus. In contrast, specifications in Isabelle |
30 can be highly non-executable. In order to bridge |
30 can be highly non-executable. In order to bridge |
31 the gap between logic and executable specifications, |
31 the gap between logic and executable specifications, |
32 an explicit non-trivial transformation has to be applied: |
32 an explicit non-trivial transformation has to be applied: |
33 code generation. |
33 code generation. |
34 |
34 |
36 Isabelle system \cite{isa-tutorial}. |
36 Isabelle system \cite{isa-tutorial}. |
37 Generic in the sense that the |
37 Generic in the sense that the |
38 \qn{target language} for which code shall ultimately be |
38 \qn{target language} for which code shall ultimately be |
39 generated is not fixed but may be an arbitrary state-of-the-art |
39 generated is not fixed but may be an arbitrary state-of-the-art |
40 functional programming language (currently, the implementation |
40 functional programming language (currently, the implementation |
41 supports SML \cite{SML}, OCaml \cite{OCaml} and Haskell \cite{haskell-revised-report}). |
41 supports SML \cite{SML}, OCaml \cite{OCaml} and Haskell |
|
42 \cite{haskell-revised-report}). |
42 We aim to provide a |
43 We aim to provide a |
43 versatile environment |
44 versatile environment |
44 suitable for software development and verification, |
45 suitable for software development and verification, |
45 structuring the process |
46 structuring the process |
46 of code generation into a small set of orthogonal principles |
47 of code generation into a small set of orthogonal principles |
47 while achieving a big coverage of application areas |
48 while achieving a big coverage of application areas |
48 with maximum flexibility. |
49 with maximum flexibility. |
49 |
50 |
50 For readers, some familiarity and experience |
51 Conceptually the code generator framework is part |
51 with the ingredients |
52 of Isabelle's @{text Pure} meta logic; the object logic |
52 of the HOL \emph{Main} theory is assumed. |
53 @{text HOL} which is an extension of @{text Pure} |
|
54 already comes with a reasonable framework setup and thus provides |
|
55 a good working horse for raising code-generation-driven |
|
56 applications. So, we assume some familiarity and experience |
|
57 with the ingredients of the @{text HOL} \emph{Main} theory |
|
58 (see also \cite{isa-tutorial}). |
53 *} |
59 *} |
54 |
60 |
55 |
61 |
56 subsection {* Overview *} |
62 subsection {* Overview *} |
57 |
63 |
58 text {* |
64 text {* |
59 The code generator aims to be usable with no further ado |
65 The code generator aims to be usable with no further ado |
60 in most cases while allowing for detailed customization. |
66 in most cases while allowing for detailed customization. |
61 This manifests in the structure of this tutorial: this introduction |
67 This manifests in the structure of this tutorial: |
62 continues with a short introduction of concepts. Section |
68 we start with a generic example \secref{sec:example} |
|
69 and introduce code generation concepts \secref{sec:concept}. |
|
70 Section |
63 \secref{sec:basics} explains how to use the framework naively, |
71 \secref{sec:basics} explains how to use the framework naively, |
64 presuming a reasonable default setup. Then, section |
72 presuming a reasonable default setup. Then, section |
65 \secref{sec:advanced} deals with advanced topics, |
73 \secref{sec:advanced} deals with advanced topics, |
66 introducing further aspects of the code generator framework |
74 introducing further aspects of the code generator framework |
67 in a motivation-driven manner. Last, section \secref{sec:ml} |
75 in a motivation-driven manner. Last, section \secref{sec:ml} |
72 is supposed to replace the already established code generator |
80 is supposed to replace the already established code generator |
73 by Stefan Berghofer \cite{Berghofer-Nipkow:2002}. |
81 by Stefan Berghofer \cite{Berghofer-Nipkow:2002}. |
74 So, for the moment, there are two distinct code generators |
82 So, for the moment, there are two distinct code generators |
75 in Isabelle. |
83 in Isabelle. |
76 Also note that while the framework itself is largely |
84 Also note that while the framework itself is largely |
77 object-logic independent, only HOL provides a reasonable |
85 object-logic independent, only @{text HOL} provides a reasonable |
78 framework setup. |
86 framework setup. |
79 \end{warn} |
87 \end{warn} |
80 *} |
88 *} |
81 |
89 |
82 |
90 |
83 subsection {* An exmaple: a simple theory of search trees *} |
91 section {* An example: a simple theory of search trees \label{sec:example} *} |
|
92 |
|
93 text {* |
|
94 When writing executable specifications, it is convenient to use |
|
95 three existing packages: the datatype package for defining |
|
96 datatypes, the function package for (recursive) functions, |
|
97 and the class package for overloaded definitions. |
|
98 |
|
99 We develope a small theory of search trees; trees are represented |
|
100 as a datatype with key type @{typ "'a"} and value type @{typ "'b"}: |
|
101 *} |
84 |
102 |
85 datatype ('a, 'b) searchtree = Leaf "'a\<Colon>linorder" 'b |
103 datatype ('a, 'b) searchtree = Leaf "'a\<Colon>linorder" 'b |
86 | Branch "('a, 'b) searchtree" "'a" "('a, 'b) searchtree" |
104 | Branch "('a, 'b) searchtree" "'a" "('a, 'b) searchtree" |
|
105 |
|
106 text {* |
|
107 \noindent Note that we have constrained the type of keys |
|
108 to the class of total orders, @{text linorder}. |
|
109 |
|
110 We define @{text find} and @{text update} functions: |
|
111 *} |
87 |
112 |
88 fun |
113 fun |
89 find :: "('a\<Colon>linorder, 'b) searchtree \<Rightarrow> 'a \<Rightarrow> 'b option" where |
114 find :: "('a\<Colon>linorder, 'b) searchtree \<Rightarrow> 'a \<Rightarrow> 'b option" where |
90 "find (Leaf key val) it = (if it = key then Some val else None)" |
115 "find (Leaf key val) it = (if it = key then Some val else None)" |
91 | "find (Branch t1 key t2) it = (if it \<le> key then find t1 it else find t2 it)" |
116 | "find (Branch t1 key t2) it = (if it \<le> key then find t1 it else find t2 it)" |
102 if it \<le> key |
127 if it \<le> key |
103 then (Branch (update (it, entry) t1) key t2) |
128 then (Branch (update (it, entry) t1) key t2) |
104 else (Branch t1 key (update (it, entry) t2)) |
129 else (Branch t1 key (update (it, entry) t2)) |
105 )" |
130 )" |
106 |
131 |
|
132 text {* |
|
133 \noindent For testing purpose, we define a small example |
|
134 using natural numbers @{typ nat} (which are a @{text linorder}) |
|
135 as keys and strings values: |
|
136 *} |
|
137 |
107 fun |
138 fun |
108 example :: "nat \<Rightarrow> (nat, string) searchtree" where |
139 example :: "nat \<Rightarrow> (nat, string) searchtree" where |
109 "example n = update (n, ''bar'') (Leaf 0 ''foo'')" |
140 "example n = update (n, ''bar'') (Leaf 0 ''foo'')" |
110 |
141 |
|
142 text {* |
|
143 \noindent Then we generate code |
|
144 *} |
|
145 |
111 code_gen example (*<*)(SML #)(*>*)(SML "examples/tree.ML") |
146 code_gen example (*<*)(SML #)(*>*)(SML "examples/tree.ML") |
112 |
147 |
113 text {* |
148 text {* |
|
149 \noindent which looks like: |
114 \lstsml{Thy/examples/tree.ML} |
150 \lstsml{Thy/examples/tree.ML} |
115 *} |
151 *} |
116 |
152 |
117 subsection {* Code generation process *} |
153 |
|
154 section {* Code generation concepts and process \label{sec:concept} *} |
118 |
155 |
119 text {* |
156 text {* |
120 \begin{figure}[h] |
157 \begin{figure}[h] |
121 \centering |
158 \centering |
122 \includegraphics[width=0.7\textwidth]{codegen_process} |
159 \includegraphics[width=0.7\textwidth]{codegen_process} |
179 fac :: "nat \<Rightarrow> nat" where |
216 fac :: "nat \<Rightarrow> nat" where |
180 "fac 0 = 1" |
217 "fac 0 = 1" |
181 | "fac (Suc n) = Suc n * fac n" |
218 | "fac (Suc n) = Suc n * fac n" |
182 |
219 |
183 text {* |
220 text {* |
184 This executable specification is now turned to SML code: |
221 \noindent This executable specification is now turned to SML code: |
185 *} |
222 *} |
186 |
223 |
187 code_gen fac (*<*)(SML #)(*>*)(SML "examples/fac.ML") |
224 code_gen fac (*<*)(SML #)(*>*)(SML "examples/fac.ML") |
188 |
225 |
189 text {* |
226 text {* |
190 The @{text "\<CODEGEN>"} command takes a space-separated list of |
227 \noindent The @{text "\<CODEGEN>"} command takes a space-separated list of |
191 constants together with \qn{serialization directives} |
228 constants together with \qn{serialization directives} |
192 in parentheses. These start with a \qn{target language} |
229 in parentheses. These start with a \qn{target language} |
193 identifier, followed by arguments, their semantics |
230 identifier, followed by arguments, their semantics |
194 depending on the target. In the SML case, a filename |
231 depending on the target. In the SML case, a filename |
195 is given where to write the generated code to. |
232 is given where to write the generated code to. |
200 code: |
237 code: |
201 |
238 |
202 \lstsml{Thy/examples/fac.ML} |
239 \lstsml{Thy/examples/fac.ML} |
203 |
240 |
204 The code generator will complain when a required |
241 The code generator will complain when a required |
205 ingredient does not provide a executable counterpart. |
242 ingredient does not provide a executable counterpart, |
206 This is the case if an involved type is not a datatype: |
243 e.g.~generating code |
207 *} |
|
208 |
|
209 (*<*) |
|
210 setup {* Sign.add_path "foo" *} |
|
211 (*>*) |
|
212 |
|
213 typedecl 'a foo |
|
214 |
|
215 definition |
|
216 bar :: "'a foo \<Rightarrow> 'a \<Rightarrow> 'a" where |
|
217 "bar x y = y" |
|
218 |
|
219 (*<*) |
|
220 hide type foo |
|
221 hide const bar |
|
222 |
|
223 setup {* Sign.parent_path *} |
|
224 |
|
225 datatype 'a foo = Foo |
|
226 |
|
227 definition |
|
228 bar :: "'a foo \<Rightarrow> 'a \<Rightarrow> 'a" where |
|
229 "bar x y = y" |
|
230 (*>*) |
|
231 |
|
232 code_gen bar (SML "examples/fail_type.ML") |
|
233 |
|
234 text {* |
|
235 \noindent will result in an error. Likewise, generating code |
|
236 for constants not yielding |
244 for constants not yielding |
237 a defining equation will fail, e.g.~the Hilbert choice |
245 a defining equation (e.g.~the Hilbert choice |
238 operation @{text "SOME"}: |
246 operation @{text "SOME"}): |
239 *} |
247 *} |
240 |
248 |
241 (*<*) |
249 (*<*) |
242 setup {* Sign.add_path "foo" *} |
250 setup {* Sign.add_path "foo" *} |
243 (*>*) |
251 (*>*) |
269 |
279 |
270 text {* |
280 text {* |
271 \noindent which displays a table of constant with corresponding |
281 \noindent which displays a table of constant with corresponding |
272 defining equations (the additional stuff displayed |
282 defining equations (the additional stuff displayed |
273 shall not bother us for the moment). If this table does |
283 shall not bother us for the moment). If this table does |
274 not provide at least one function |
284 not provide at least one defining |
275 equation, the table of primitive definitions is searched |
285 equation for a particular constant, the table of primitive |
|
286 definitions is searched |
276 whether it provides one. |
287 whether it provides one. |
277 |
288 |
278 The typical HOL tools are already set up in a way that |
289 The typical HOL tools are already set up in a way that |
279 function definitions introduced by @{text "\<FUN>"}, |
290 function definitions introduced by @{text "\<FUN>"}, |
280 @{text "\<FUNCTION>"}, @{text "\<PRIMREC>"} |
291 @{text "\<FUNCTION>"}, @{text "\<PRIMREC>"} |
1145 constants by unique identifiers. |
1168 constants by unique identifiers. |
1146 *} |
1169 *} |
1147 |
1170 |
1148 text %mlref {* |
1171 text %mlref {* |
1149 \begin{mldecls} |
1172 \begin{mldecls} |
1150 @{index_ML_type CodegenConsts.const: "string * typ list"} \\ |
1173 @{index_ML_type CodegenConsts.const: "string * string option"} \\ |
1151 @{index_ML CodegenConsts.norm_of_typ: "theory -> string * typ -> CodegenConsts.const"} \\ |
1174 @{index_ML CodegenConsts.const_of_cexpr: "theory -> string * typ -> CodegenConsts.const"} \\ |
1152 @{index_ML CodegenConsts.typ_of_inst: "theory -> CodegenConsts.const -> string * typ"} \\ |
|
1153 \end{mldecls} |
1175 \end{mldecls} |
1154 |
1176 |
1155 \begin{description} |
1177 \begin{description} |
1156 |
1178 |
1157 \item @{ML_type CodegenConsts.const} is the identifier type: |
1179 \item @{ML_type CodegenConsts.const} is the identifier type: |
1158 the product of a \emph{string} with a list of \emph{typs}. |
1180 the product of a \emph{string} with a list of \emph{typs}. |
1159 The \emph{string} is the constant name as represented inside Isabelle; |
1181 The \emph{string} is the constant name as represented inside Isabelle; |
1160 the \emph{typs} are a type instantiation in the sense of System F, |
1182 for overloaded constants, the attached \emph{string option} |
1161 with canonical names for type variables. |
1183 is either @{text SOME} type constructor denoting an instance, |
1162 |
1184 or @{text NONE} for the polymorphic constant. |
1163 \item @{ML CodegenConsts.norm_of_typ}~@{text thy}~@{text "(constname, typ)"} |
1185 |
1164 maps a constant expression @{text "(constname, typ)"} to its canonical identifier. |
1186 \item @{ML CodegenConsts.const_of_cexpr}~@{text thy}~@{text "(constname, typ)"} |
1165 |
1187 maps a constant expression @{text "(constname, typ)"} |
1166 \item @{ML CodegenConsts.typ_of_inst}~@{text thy}~@{text const} |
1188 to its canonical identifier. |
1167 maps a canonical identifier @{text const} to a constant |
|
1168 expression with appropriate type. |
|
1169 |
1189 |
1170 \end{description} |
1190 \end{description} |
1171 *} |
1191 *} |
1172 |
1192 |
1173 subsection {* Executable theory content: codegen\_data.ML *} |
1193 subsection {* Executable theory content: codegen\_data.ML *} |
1194 |
1214 |
1195 subsubsection {* Managing executable content *} |
1215 subsubsection {* Managing executable content *} |
1196 |
1216 |
1197 text %mlref {* |
1217 text %mlref {* |
1198 \begin{mldecls} |
1218 \begin{mldecls} |
1199 @{index_ML CodegenData.add_func: "thm -> theory -> theory"} \\ |
1219 @{index_ML CodegenData.add_func: "bool -> thm -> theory -> theory"} \\ |
1200 @{index_ML CodegenData.del_func: "thm -> theory -> theory"} \\ |
1220 @{index_ML CodegenData.del_func: "thm -> theory -> theory"} \\ |
1201 @{index_ML CodegenData.add_funcl: "CodegenConsts.const * thm list Susp.T -> theory -> theory"} \\ |
1221 @{index_ML CodegenData.add_funcl: "CodegenConsts.const * thm list Susp.T -> theory -> theory"} \\ |
1202 @{index_ML CodegenData.add_inline: "thm -> theory -> theory"} \\ |
1222 @{index_ML CodegenData.add_inline: "thm -> theory -> theory"} \\ |
1203 @{index_ML CodegenData.del_inline: "thm -> theory -> theory"} \\ |
1223 @{index_ML CodegenData.del_inline: "thm -> theory -> theory"} \\ |
1204 @{index_ML CodegenData.add_inline_proc: "string * (theory -> cterm list -> thm list) |
1224 @{index_ML CodegenData.add_inline_proc: "string * (theory -> cterm list -> thm list) |
1270 |
1290 |
1271 text %mlref {* |
1291 text %mlref {* |
1272 \begin{mldecls} |
1292 \begin{mldecls} |
1273 @{index_ML CodegenConsts.const_ord: "CodegenConsts.const * CodegenConsts.const -> order"} \\ |
1293 @{index_ML CodegenConsts.const_ord: "CodegenConsts.const * CodegenConsts.const -> order"} \\ |
1274 @{index_ML CodegenConsts.eq_const: "CodegenConsts.const * CodegenConsts.const -> bool"} \\ |
1294 @{index_ML CodegenConsts.eq_const: "CodegenConsts.const * CodegenConsts.const -> bool"} \\ |
1275 @{index_ML CodegenConsts.consts_of: "theory -> term -> CodegenConsts.const list"} \\ |
|
1276 @{index_ML CodegenConsts.read_const: "theory -> string -> CodegenConsts.const"} \\ |
1295 @{index_ML CodegenConsts.read_const: "theory -> string -> CodegenConsts.const"} \\ |
1277 @{index_ML_structure CodegenConsts.Consttab} \\ |
1296 @{index_ML_structure CodegenConsts.Consttab} \\ |
1278 @{index_ML CodegenFunc.typ_func: "thm -> typ"} \\ |
1297 @{index_ML CodegenFunc.typ_func: "thm -> typ"} \\ |
1279 @{index_ML CodegenFunc.rewrite_func: "thm list -> thm -> thm"} \\ |
1298 @{index_ML CodegenFunc.rewrite_func: "thm list -> thm -> thm"} \\ |
1280 \end{mldecls} |
1299 \end{mldecls} |
1284 \item @{ML CodegenConsts.const_ord},~@{ML CodegenConsts.eq_const} |
1303 \item @{ML CodegenConsts.const_ord},~@{ML CodegenConsts.eq_const} |
1285 provide order and equality on constant identifiers. |
1304 provide order and equality on constant identifiers. |
1286 |
1305 |
1287 \item @{ML_struct CodegenConsts.Consttab} |
1306 \item @{ML_struct CodegenConsts.Consttab} |
1288 provides table structures with constant identifiers as keys. |
1307 provides table structures with constant identifiers as keys. |
1289 |
|
1290 \item @{ML CodegenConsts.consts_of}~@{text thy}~@{text t} |
|
1291 returns all constant identifiers mentioned in a term @{text t}. |
|
1292 |
1308 |
1293 \item @{ML CodegenConsts.read_const}~@{text thy}~@{text s} |
1309 \item @{ML CodegenConsts.read_const}~@{text thy}~@{text s} |
1294 reads a constant as a concrete term expression @{text s}. |
1310 reads a constant as a concrete term expression @{text s}. |
1295 |
1311 |
1296 \item @{ML CodegenFunc.typ_func}~@{text thm} |
1312 \item @{ML CodegenFunc.typ_func}~@{text thm} |