author | haftmann |
Fri, 15 Feb 2013 11:47:33 +0100 | |
changeset 51160 | 599ff65b85e2 |
parent 51143 | 0a2371e7ced3 |
child 51162 | 310b94ed1815 |
permissions | -rw-r--r-- |
31050 | 1 |
theory Adaptation |
28213 | 2 |
imports Setup |
3 |
begin |
|
4 |
||
40351 | 5 |
setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) |
6 |
#> Code_Target.extend_target ("\<SMLdummy>", ("Haskell", K I)) *} |
|
28561 | 7 |
|
31050 | 8 |
section {* Adaptation to target languages \label{sec:adaptation} *} |
28419 | 9 |
|
28561 | 10 |
subsection {* Adapting code generation *} |
11 |
||
12 |
text {* |
|
13 |
The aspects of code generation introduced so far have two aspects |
|
14 |
in common: |
|
15 |
||
16 |
\begin{itemize} |
|
38450 | 17 |
|
18 |
\item They act uniformly, without reference to a specific target |
|
19 |
language. |
|
20 |
||
28561 | 21 |
\item They are \emph{safe} in the sense that as long as you trust |
22 |
the code generator meta theory and implementation, you cannot |
|
38450 | 23 |
produce programs that yield results which are not derivable in |
24 |
the logic. |
|
25 |
||
28561 | 26 |
\end{itemize} |
27 |
||
38450 | 28 |
\noindent In this section we will introduce means to \emph{adapt} |
29 |
the serialiser to a specific target language, i.e.~to print program |
|
30 |
fragments in a way which accommodates \qt{already existing} |
|
31 |
ingredients of a target language environment, for three reasons: |
|
28561 | 32 |
|
33 |
\begin{itemize} |
|
28593 | 34 |
\item improving readability and aesthetics of generated code |
28561 | 35 |
\item gaining efficiency |
36 |
\item interface with language parts which have no direct counterpart |
|
37 |
in @{text "HOL"} (say, imperative data structures) |
|
38 |
\end{itemize} |
|
39 |
||
40 |
\noindent Generally, you should avoid using those features yourself |
|
41 |
\emph{at any cost}: |
|
42 |
||
43 |
\begin{itemize} |
|
38450 | 44 |
|
45 |
\item The safe configuration methods act uniformly on every target |
|
46 |
language, whereas for adaptation you have to treat each target |
|
47 |
language separately. |
|
48 |
||
49 |
\item Application is extremely tedious since there is no |
|
50 |
abstraction which would allow for a static check, making it easy |
|
51 |
to produce garbage. |
|
52 |
||
34155 | 53 |
\item Subtle errors can be introduced unconsciously. |
38450 | 54 |
|
28561 | 55 |
\end{itemize} |
56 |
||
38450 | 57 |
\noindent However, even if you ought refrain from setting up |
58 |
adaptation yourself, already the @{text "HOL"} comes with some |
|
59 |
reasonable default adaptations (say, using target language list |
|
60 |
syntax). There also some common adaptation cases which you can |
|
61 |
setup by importing particular library theories. In order to |
|
62 |
understand these, we provide some clues here; these however are not |
|
63 |
supposed to replace a careful study of the sources. |
|
28561 | 64 |
*} |
65 |
||
38450 | 66 |
|
31050 | 67 |
subsection {* The adaptation principle *} |
28561 | 68 |
|
69 |
text {* |
|
38450 | 70 |
Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is |
71 |
conceptually supposed to be: |
|
28601 | 72 |
|
73 |
\begin{figure}[here] |
|
48954
c548d26daa8c
avoid clash with generated Adaptation.tex on case-insensible file-systems;
wenzelm
parents:
48951
diff
changeset
|
74 |
\includegraphics{adapt} |
31050 | 75 |
\caption{The adaptation principle} |
76 |
\label{fig:adaptation} |
|
28601 | 77 |
\end{figure} |
78 |
||
79 |
\noindent In the tame view, code generation acts as broker between |
|
38450 | 80 |
@{text logic}, @{text "intermediate language"} and @{text "target |
81 |
language"} by means of @{text translation} and @{text |
|
82 |
serialisation}; for the latter, the serialiser has to observe the |
|
83 |
structure of the @{text language} itself plus some @{text reserved} |
|
84 |
keywords which have to be avoided for generated code. However, if |
|
85 |
you consider @{text adaptation} mechanisms, the code generated by |
|
86 |
the serializer is just the tip of the iceberg: |
|
28601 | 87 |
|
88 |
\begin{itemize} |
|
38450 | 89 |
|
28635 | 90 |
\item @{text serialisation} can be \emph{parametrised} such that |
91 |
logical entities are mapped to target-specific ones |
|
38450 | 92 |
(e.g. target-specific list syntax, see also |
93 |
\secref{sec:adaptation_mechanisms}) |
|
94 |
||
28635 | 95 |
\item Such parametrisations can involve references to a |
38450 | 96 |
target-specific standard @{text library} (e.g. using the @{text |
97 |
Haskell} @{verbatim Maybe} type instead of the @{text HOL} |
|
98 |
@{type "option"} type); if such are used, the corresponding |
|
99 |
identifiers (in our example, @{verbatim Maybe}, @{verbatim |
|
100 |
Nothing} and @{verbatim Just}) also have to be considered @{text |
|
101 |
reserved}. |
|
102 |
||
28635 | 103 |
\item Even more, the user can enrich the library of the |
38450 | 104 |
target-language by providing code snippets (\qt{@{text |
105 |
"includes"}}) which are prepended to any generated code (see |
|
106 |
\secref{sec:include}); this typically also involves further |
|
107 |
@{text reserved} identifiers. |
|
108 |
||
28601 | 109 |
\end{itemize} |
28635 | 110 |
|
38450 | 111 |
\noindent As figure \ref{fig:adaptation} illustrates, all these |
112 |
adaptation mechanisms have to act consistently; it is at the |
|
113 |
discretion of the user to take care for this. |
|
28561 | 114 |
*} |
115 |
||
31050 | 116 |
subsection {* Common adaptation patterns *} |
28419 | 117 |
|
118 |
text {* |
|
28428 | 119 |
The @{theory HOL} @{theory Main} theory already provides a code |
38450 | 120 |
generator setup which should be suitable for most applications. |
121 |
Common extensions and modifications are available by certain |
|
122 |
theories of the @{text HOL} library; beside being useful in |
|
123 |
applications, they may serve as a tutorial for customising the code |
|
124 |
generator setup (see below \secref{sec:adaptation_mechanisms}). |
|
28419 | 125 |
|
126 |
\begin{description} |
|
127 |
||
51143
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
128 |
\item[@{theory "Code_Numeral"}] provides additional numeric |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
129 |
types @{typ integer} and @{typ natural} isomorphic to types |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
130 |
@{typ int} and @{typ nat} respectively. Type @{typ integer} |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
131 |
is mapped to target-language built-in integers; @{typ natural} |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
132 |
is implemented as abstract type over @{typ integer}. |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
133 |
Useful for code setups which involve e.g.~indexing |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
134 |
of target-language arrays. Part of @{text "HOL-Main"}. |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
135 |
|
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
136 |
\item[@{text "Code_Target_Int"}] implements type @{typ int} |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
137 |
by @{typ integer} and thus by target-language built-in integers. |
38450 | 138 |
|
51143
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
139 |
\item[@{text "Code_Binary_Nat"}] \label{eff_nat} implements type |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
140 |
@{typ nat} using a binary rather than a linear representation, |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
141 |
which yields a considerable speedup for computations. |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
142 |
Pattern matching with @{term "0\<Colon>nat"} / @{const "Suc"} is eliminated |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
143 |
by a preprocessor. |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
144 |
|
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
145 |
\item[@{text "Code_Target_Nat"}] implements type @{typ int} |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
146 |
by @{typ integer} and thus by target-language built-in integers; |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
147 |
contains @{text "Code_Binary_Nat"} as a prerequisite. |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
148 |
|
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
149 |
\item[@{text "Code_Target_Numeral"}] is a convenience node |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
150 |
containing both @{text "Code_Target_Nat"} and |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
151 |
@{text "Code_Target_Int"}. |
38450 | 152 |
|
46519 | 153 |
\item[@{text "Efficient_Nat"}] \label{eff_nat} implements |
38450 | 154 |
natural numbers by integers, which in general will result in |
155 |
higher efficiency; pattern matching with @{term "0\<Colon>nat"} / |
|
46519 | 156 |
@{const "Suc"} is eliminated; includes @{text "Code_Integer"} |
157 |
and @{text "Code_Numeral"}. |
|
38450 | 158 |
|
51143
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
159 |
\item[@{text "Code_Char"}] represents @{text HOL} characters by |
0a2371e7ced3
two target language numeral types: integer and natural, as replacement for code_numeral;
haftmann
parents:
48985
diff
changeset
|
160 |
character literals in target languages. |
38450 | 161 |
|
162 |
\item[@{theory "String"}] provides an additional datatype @{typ |
|
163 |
String.literal} which is isomorphic to strings; @{typ |
|
164 |
String.literal}s are mapped to target-language strings. Useful |
|
165 |
for code setups which involve e.g.~printing (error) messages. |
|
46519 | 166 |
Part of @{text "HOL-Main"}. |
28419 | 167 |
|
168 |
\end{description} |
|
169 |
||
170 |
\begin{warn} |
|
46519 | 171 |
When importing any of those theories which are not part of |
172 |
@{text "HOL-Main"}, they should form the last |
|
38450 | 173 |
items in an import list. Since these theories adapt the code |
174 |
generator setup in a non-conservative fashion, strange effects may |
|
175 |
occur otherwise. |
|
28419 | 176 |
\end{warn} |
177 |
*} |
|
178 |
||
179 |
||
31050 | 180 |
subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *} |
28419 | 181 |
|
182 |
text {* |
|
38450 | 183 |
Consider the following function and its corresponding SML code: |
28419 | 184 |
*} |
185 |
||
28564 | 186 |
primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where |
28419 | 187 |
"in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l" |
28447 | 188 |
(*<*) |
28419 | 189 |
code_type %invisible bool |
190 |
(SML) |
|
191 |
code_const %invisible True and False and "op \<and>" and Not |
|
192 |
(SML and and and) |
|
28447 | 193 |
(*>*) |
39745 | 194 |
text %quotetypewriter {* |
39683 | 195 |
@{code_stmts in_interval (SML)} |
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
39643
diff
changeset
|
196 |
*} |
28419 | 197 |
|
198 |
text {* |
|
38450 | 199 |
\noindent Though this is correct code, it is a little bit |
200 |
unsatisfactory: boolean values and operators are materialised as |
|
201 |
distinguished entities with have nothing to do with the SML-built-in |
|
202 |
notion of \qt{bool}. This results in less readable code; |
|
203 |
additionally, eager evaluation may cause programs to loop or break |
|
204 |
which would perfectly terminate when the existing SML @{verbatim |
|
205 |
"bool"} would be used. To map the HOL @{typ bool} on SML @{verbatim |
|
206 |
"bool"}, we may use \qn{custom serialisations}: |
|
28419 | 207 |
*} |
208 |
||
39745 | 209 |
code_type %quotett bool |
28419 | 210 |
(SML "bool") |
39745 | 211 |
code_const %quotett True and False and "op \<and>" |
28419 | 212 |
(SML "true" and "false" and "_ andalso _") |
28213 | 213 |
|
28419 | 214 |
text {* |
38505 | 215 |
\noindent The @{command_def code_type} command takes a type constructor |
38450 | 216 |
as arguments together with a list of custom serialisations. Each |
217 |
custom serialisation starts with a target language identifier |
|
218 |
followed by an expression, which during code serialisation is |
|
219 |
inserted whenever the type constructor would occur. For constants, |
|
38505 | 220 |
@{command_def code_const} implements the corresponding mechanism. Each |
38450 | 221 |
``@{verbatim "_"}'' in a serialisation expression is treated as a |
222 |
placeholder for the type constructor's (the constant's) arguments. |
|
28419 | 223 |
*} |
224 |
||
39745 | 225 |
text %quotetypewriter {* |
39683 | 226 |
@{code_stmts in_interval (SML)} |
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
39643
diff
changeset
|
227 |
*} |
28419 | 228 |
|
229 |
text {* |
|
38450 | 230 |
\noindent This still is not perfect: the parentheses around the |
231 |
\qt{andalso} expression are superfluous. Though the serialiser by |
|
232 |
no means attempts to imitate the rich Isabelle syntax framework, it |
|
233 |
provides some common idioms, notably associative infixes with |
|
234 |
precedences which may be used here: |
|
28419 | 235 |
*} |
236 |
||
39745 | 237 |
code_const %quotett "op \<and>" |
28419 | 238 |
(SML infixl 1 "andalso") |
239 |
||
39745 | 240 |
text %quotetypewriter {* |
39683 | 241 |
@{code_stmts in_interval (SML)} |
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
39643
diff
changeset
|
242 |
*} |
28419 | 243 |
|
244 |
text {* |
|
38450 | 245 |
\noindent The attentive reader may ask how we assert that no |
246 |
generated code will accidentally overwrite. For this reason the |
|
247 |
serialiser has an internal table of identifiers which have to be |
|
248 |
avoided to be used for new declarations. Initially, this table |
|
249 |
typically contains the keywords of the target language. It can be |
|
250 |
extended manually, thus avoiding accidental overwrites, using the |
|
38505 | 251 |
@{command_def "code_reserved"} command: |
28561 | 252 |
*} |
253 |
||
40351 | 254 |
code_reserved %quote "\<SMLdummy>" bool true false andalso |
28561 | 255 |
|
256 |
text {* |
|
28447 | 257 |
\noindent Next, we try to map HOL pairs to SML pairs, using the |
28419 | 258 |
infix ``@{verbatim "*"}'' type constructor and parentheses: |
259 |
*} |
|
28447 | 260 |
(*<*) |
37836 | 261 |
code_type %invisible prod |
28419 | 262 |
(SML) |
263 |
code_const %invisible Pair |
|
264 |
(SML) |
|
28447 | 265 |
(*>*) |
39745 | 266 |
code_type %quotett prod |
28419 | 267 |
(SML infix 2 "*") |
39745 | 268 |
code_const %quotett Pair |
28419 | 269 |
(SML "!((_),/ (_))") |
270 |
||
271 |
text {* |
|
28593 | 272 |
\noindent The initial bang ``@{verbatim "!"}'' tells the serialiser |
38450 | 273 |
never to put parentheses around the whole expression (they are |
274 |
already present), while the parentheses around argument place |
|
275 |
holders tell not to put parentheses around the arguments. The slash |
|
276 |
``@{verbatim "/"}'' (followed by arbitrary white space) inserts a |
|
277 |
space which may be used as a break if necessary during pretty |
|
278 |
printing. |
|
28419 | 279 |
|
38450 | 280 |
These examples give a glimpse what mechanisms custom serialisations |
281 |
provide; however their usage requires careful thinking in order not |
|
282 |
to introduce inconsistencies -- or, in other words: custom |
|
283 |
serialisations are completely axiomatic. |
|
28419 | 284 |
|
39643 | 285 |
A further noteworthy detail is that any special character in a |
38450 | 286 |
custom serialisation may be quoted using ``@{verbatim "'"}''; thus, |
287 |
in ``@{verbatim "fn '_ => _"}'' the first ``@{verbatim "_"}'' is a |
|
288 |
proper underscore while the second ``@{verbatim "_"}'' is a |
|
289 |
placeholder. |
|
28419 | 290 |
*} |
291 |
||
292 |
||
293 |
subsection {* @{text Haskell} serialisation *} |
|
294 |
||
295 |
text {* |
|
38450 | 296 |
For convenience, the default @{text HOL} setup for @{text Haskell} |
39063 | 297 |
maps the @{class equal} class to its counterpart in @{text Haskell}, |
298 |
giving custom serialisations for the class @{class equal} (by command |
|
39643 | 299 |
@{command_def code_class}) and its operation @{const [source] HOL.equal} |
28419 | 300 |
*} |
301 |
||
39745 | 302 |
code_class %quotett equal |
28714 | 303 |
(Haskell "Eq") |
28419 | 304 |
|
39745 | 305 |
code_const %quotett "HOL.equal" |
28419 | 306 |
(Haskell infixl 4 "==") |
307 |
||
308 |
text {* |
|
38450 | 309 |
\noindent A problem now occurs whenever a type which is an instance |
39063 | 310 |
of @{class equal} in @{text HOL} is mapped on a @{text |
38450 | 311 |
Haskell}-built-in type which is also an instance of @{text Haskell} |
312 |
@{text Eq}: |
|
28419 | 313 |
*} |
314 |
||
28564 | 315 |
typedecl %quote bar |
28419 | 316 |
|
39063 | 317 |
instantiation %quote bar :: equal |
28419 | 318 |
begin |
319 |
||
39063 | 320 |
definition %quote "HOL.equal (x\<Colon>bar) y \<longleftrightarrow> x = y" |
28419 | 321 |
|
39063 | 322 |
instance %quote by default (simp add: equal_bar_def) |
28213 | 323 |
|
30880 | 324 |
end %quote (*<*) |
325 |
||
39745 | 326 |
(*>*) code_type %quotett bar |
28419 | 327 |
(Haskell "Integer") |
328 |
||
329 |
text {* |
|
38450 | 330 |
\noindent The code generator would produce an additional instance, |
331 |
which of course is rejected by the @{text Haskell} compiler. To |
|
38506 | 332 |
suppress this additional instance, use @{command_def "code_instance"}: |
28419 | 333 |
*} |
334 |
||
39745 | 335 |
code_instance %quotett bar :: equal |
28419 | 336 |
(Haskell -) |
337 |
||
28561 | 338 |
|
28635 | 339 |
subsection {* Enhancing the target language context \label{sec:include} *} |
28561 | 340 |
|
341 |
text {* |
|
28593 | 342 |
In rare cases it is necessary to \emph{enrich} the context of a |
38505 | 343 |
target language; this is accomplished using the @{command_def |
38450 | 344 |
"code_include"} command: |
28561 | 345 |
*} |
346 |
||
39745 | 347 |
code_include %quotett Haskell "Errno" |
28561 | 348 |
{*errno i = error ("Error number: " ++ show i)*} |
349 |
||
39745 | 350 |
code_reserved %quotett Haskell Errno |
28561 | 351 |
|
352 |
text {* |
|
38450 | 353 |
\noindent Such named @{text include}s are then prepended to every |
354 |
generated code. Inspect such code in order to find out how |
|
355 |
@{command "code_include"} behaves with respect to a particular |
|
356 |
target language. |
|
28561 | 357 |
*} |
358 |
||
28419 | 359 |
end |
46519 | 360 |