author | wenzelm |
Fri, 24 Aug 2012 16:43:37 +0200 | |
changeset 48920 | 9f84d872feba |
parent 46519 | 17dde5feea4b |
permissions | -rw-r--r-- |
31050 | 1 |
theory Adaptation |
28213 | 2 |
imports Setup |
3 |
begin |
|
4 |
||
40351 | 5 |
setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) |
6 |
#> Code_Target.extend_target ("\<SMLdummy>", ("Haskell", K I)) *} |
|
28561 | 7 |
|
31050 | 8 |
section {* Adaptation to target languages \label{sec:adaptation} *} |
28419 | 9 |
|
28561 | 10 |
subsection {* Adapting code generation *} |
11 |
||
12 |
text {* |
|
13 |
The aspects of code generation introduced so far have two aspects |
|
14 |
in common: |
|
15 |
||
16 |
\begin{itemize} |
|
38450 | 17 |
|
18 |
\item They act uniformly, without reference to a specific target |
|
19 |
language. |
|
20 |
||
28561 | 21 |
\item They are \emph{safe} in the sense that as long as you trust |
22 |
the code generator meta theory and implementation, you cannot |
|
38450 | 23 |
produce programs that yield results which are not derivable in |
24 |
the logic. |
|
25 |
||
28561 | 26 |
\end{itemize} |
27 |
||
38450 | 28 |
\noindent In this section we will introduce means to \emph{adapt} |
29 |
the serialiser to a specific target language, i.e.~to print program |
|
30 |
fragments in a way which accommodates \qt{already existing} |
|
31 |
ingredients of a target language environment, for three reasons: |
|
28561 | 32 |
|
33 |
\begin{itemize} |
|
28593 | 34 |
\item improving readability and aesthetics of generated code |
28561 | 35 |
\item gaining efficiency |
36 |
\item interface with language parts which have no direct counterpart |
|
37 |
in @{text "HOL"} (say, imperative data structures) |
|
38 |
\end{itemize} |
|
39 |
||
40 |
\noindent Generally, you should avoid using those features yourself |
|
41 |
\emph{at any cost}: |
|
42 |
||
43 |
\begin{itemize} |
|
38450 | 44 |
|
45 |
\item The safe configuration methods act uniformly on every target |
|
46 |
language, whereas for adaptation you have to treat each target |
|
47 |
language separately. |
|
48 |
||
49 |
\item Application is extremely tedious since there is no |
|
50 |
abstraction which would allow for a static check, making it easy |
|
51 |
to produce garbage. |
|
52 |
||
34155 | 53 |
\item Subtle errors can be introduced unconsciously. |
38450 | 54 |
|
28561 | 55 |
\end{itemize} |
56 |
||
38450 | 57 |
\noindent However, even if you ought refrain from setting up |
58 |
adaptation yourself, already the @{text "HOL"} comes with some |
|
59 |
reasonable default adaptations (say, using target language list |
|
60 |
syntax). There also some common adaptation cases which you can |
|
61 |
setup by importing particular library theories. In order to |
|
62 |
understand these, we provide some clues here; these however are not |
|
63 |
supposed to replace a careful study of the sources. |
|
28561 | 64 |
*} |
65 |
||
38450 | 66 |
|
31050 | 67 |
subsection {* The adaptation principle *} |
28561 | 68 |
|
69 |
text {* |
|
38450 | 70 |
Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is |
71 |
conceptually supposed to be: |
|
28601 | 72 |
|
73 |
\begin{figure}[here] |
|
31050 | 74 |
\includegraphics{adaptation} |
75 |
\caption{The adaptation principle} |
|
76 |
\label{fig:adaptation} |
|
28601 | 77 |
\end{figure} |
78 |
||
79 |
\noindent In the tame view, code generation acts as broker between |
|
38450 | 80 |
@{text logic}, @{text "intermediate language"} and @{text "target |
81 |
language"} by means of @{text translation} and @{text |
|
82 |
serialisation}; for the latter, the serialiser has to observe the |
|
83 |
structure of the @{text language} itself plus some @{text reserved} |
|
84 |
keywords which have to be avoided for generated code. However, if |
|
85 |
you consider @{text adaptation} mechanisms, the code generated by |
|
86 |
the serializer is just the tip of the iceberg: |
|
28601 | 87 |
|
88 |
\begin{itemize} |
|
38450 | 89 |
|
28635 | 90 |
\item @{text serialisation} can be \emph{parametrised} such that |
91 |
logical entities are mapped to target-specific ones |
|
38450 | 92 |
(e.g. target-specific list syntax, see also |
93 |
\secref{sec:adaptation_mechanisms}) |
|
94 |
||
28635 | 95 |
\item Such parametrisations can involve references to a |
38450 | 96 |
target-specific standard @{text library} (e.g. using the @{text |
97 |
Haskell} @{verbatim Maybe} type instead of the @{text HOL} |
|
98 |
@{type "option"} type); if such are used, the corresponding |
|
99 |
identifiers (in our example, @{verbatim Maybe}, @{verbatim |
|
100 |
Nothing} and @{verbatim Just}) also have to be considered @{text |
|
101 |
reserved}. |
|
102 |
||
28635 | 103 |
\item Even more, the user can enrich the library of the |
38450 | 104 |
target-language by providing code snippets (\qt{@{text |
105 |
"includes"}}) which are prepended to any generated code (see |
|
106 |
\secref{sec:include}); this typically also involves further |
|
107 |
@{text reserved} identifiers. |
|
108 |
||
28601 | 109 |
\end{itemize} |
28635 | 110 |
|
38450 | 111 |
\noindent As figure \ref{fig:adaptation} illustrates, all these |
112 |
adaptation mechanisms have to act consistently; it is at the |
|
113 |
discretion of the user to take care for this. |
|
28561 | 114 |
*} |
115 |
||
31050 | 116 |
subsection {* Common adaptation patterns *} |
28419 | 117 |
|
118 |
text {* |
|
28428 | 119 |
The @{theory HOL} @{theory Main} theory already provides a code |
38450 | 120 |
generator setup which should be suitable for most applications. |
121 |
Common extensions and modifications are available by certain |
|
122 |
theories of the @{text HOL} library; beside being useful in |
|
123 |
applications, they may serve as a tutorial for customising the code |
|
124 |
generator setup (see below \secref{sec:adaptation_mechanisms}). |
|
28419 | 125 |
|
126 |
\begin{description} |
|
127 |
||
46519 | 128 |
\item[@{text "Code_Integer"}] represents @{text HOL} integers by |
38450 | 129 |
big integer literals in target languages. |
130 |
||
46519 | 131 |
\item[@{text "Code_Char"}] represents @{text HOL} characters by |
28419 | 132 |
character literals in target languages. |
38450 | 133 |
|
46519 | 134 |
\item[@{text "Code_Char_chr"}] like @{text "Code_Char"}, but |
135 |
also offers treatment of character codes; includes @{text |
|
38450 | 136 |
"Code_Char"}. |
137 |
||
46519 | 138 |
\item[@{text "Efficient_Nat"}] \label{eff_nat} implements |
38450 | 139 |
natural numbers by integers, which in general will result in |
140 |
higher efficiency; pattern matching with @{term "0\<Colon>nat"} / |
|
46519 | 141 |
@{const "Suc"} is eliminated; includes @{text "Code_Integer"} |
142 |
and @{text "Code_Numeral"}. |
|
38450 | 143 |
|
31206 | 144 |
\item[@{theory "Code_Numeral"}] provides an additional datatype |
38450 | 145 |
@{typ index} which is mapped to target-language built-in |
146 |
integers. Useful for code setups which involve e.g.~indexing |
|
46519 | 147 |
of target-language arrays. Part of @{text "HOL-Main"}. |
38450 | 148 |
|
149 |
\item[@{theory "String"}] provides an additional datatype @{typ |
|
150 |
String.literal} which is isomorphic to strings; @{typ |
|
151 |
String.literal}s are mapped to target-language strings. Useful |
|
152 |
for code setups which involve e.g.~printing (error) messages. |
|
46519 | 153 |
Part of @{text "HOL-Main"}. |
28419 | 154 |
|
155 |
\end{description} |
|
156 |
||
157 |
\begin{warn} |
|
46519 | 158 |
When importing any of those theories which are not part of |
159 |
@{text "HOL-Main"}, they should form the last |
|
38450 | 160 |
items in an import list. Since these theories adapt the code |
161 |
generator setup in a non-conservative fashion, strange effects may |
|
162 |
occur otherwise. |
|
28419 | 163 |
\end{warn} |
164 |
*} |
|
165 |
||
166 |
||
31050 | 167 |
subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *} |
28419 | 168 |
|
169 |
text {* |
|
38450 | 170 |
Consider the following function and its corresponding SML code: |
28419 | 171 |
*} |
172 |
||
28564 | 173 |
primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where |
28419 | 174 |
"in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l" |
28447 | 175 |
(*<*) |
28419 | 176 |
code_type %invisible bool |
177 |
(SML) |
|
178 |
code_const %invisible True and False and "op \<and>" and Not |
|
179 |
(SML and and and) |
|
28447 | 180 |
(*>*) |
39745 | 181 |
text %quotetypewriter {* |
39683 | 182 |
@{code_stmts in_interval (SML)} |
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
39643
diff
changeset
|
183 |
*} |
28419 | 184 |
|
185 |
text {* |
|
38450 | 186 |
\noindent Though this is correct code, it is a little bit |
187 |
unsatisfactory: boolean values and operators are materialised as |
|
188 |
distinguished entities with have nothing to do with the SML-built-in |
|
189 |
notion of \qt{bool}. This results in less readable code; |
|
190 |
additionally, eager evaluation may cause programs to loop or break |
|
191 |
which would perfectly terminate when the existing SML @{verbatim |
|
192 |
"bool"} would be used. To map the HOL @{typ bool} on SML @{verbatim |
|
193 |
"bool"}, we may use \qn{custom serialisations}: |
|
28419 | 194 |
*} |
195 |
||
39745 | 196 |
code_type %quotett bool |
28419 | 197 |
(SML "bool") |
39745 | 198 |
code_const %quotett True and False and "op \<and>" |
28419 | 199 |
(SML "true" and "false" and "_ andalso _") |
28213 | 200 |
|
28419 | 201 |
text {* |
38505 | 202 |
\noindent The @{command_def code_type} command takes a type constructor |
38450 | 203 |
as arguments together with a list of custom serialisations. Each |
204 |
custom serialisation starts with a target language identifier |
|
205 |
followed by an expression, which during code serialisation is |
|
206 |
inserted whenever the type constructor would occur. For constants, |
|
38505 | 207 |
@{command_def code_const} implements the corresponding mechanism. Each |
38450 | 208 |
``@{verbatim "_"}'' in a serialisation expression is treated as a |
209 |
placeholder for the type constructor's (the constant's) arguments. |
|
28419 | 210 |
*} |
211 |
||
39745 | 212 |
text %quotetypewriter {* |
39683 | 213 |
@{code_stmts in_interval (SML)} |
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
39643
diff
changeset
|
214 |
*} |
28419 | 215 |
|
216 |
text {* |
|
38450 | 217 |
\noindent This still is not perfect: the parentheses around the |
218 |
\qt{andalso} expression are superfluous. Though the serialiser by |
|
219 |
no means attempts to imitate the rich Isabelle syntax framework, it |
|
220 |
provides some common idioms, notably associative infixes with |
|
221 |
precedences which may be used here: |
|
28419 | 222 |
*} |
223 |
||
39745 | 224 |
code_const %quotett "op \<and>" |
28419 | 225 |
(SML infixl 1 "andalso") |
226 |
||
39745 | 227 |
text %quotetypewriter {* |
39683 | 228 |
@{code_stmts in_interval (SML)} |
39664
0afaf89ab591
more canonical type setting of type writer code examples
haftmann
parents:
39643
diff
changeset
|
229 |
*} |
28419 | 230 |
|
231 |
text {* |
|
38450 | 232 |
\noindent The attentive reader may ask how we assert that no |
233 |
generated code will accidentally overwrite. For this reason the |
|
234 |
serialiser has an internal table of identifiers which have to be |
|
235 |
avoided to be used for new declarations. Initially, this table |
|
236 |
typically contains the keywords of the target language. It can be |
|
237 |
extended manually, thus avoiding accidental overwrites, using the |
|
38505 | 238 |
@{command_def "code_reserved"} command: |
28561 | 239 |
*} |
240 |
||
40351 | 241 |
code_reserved %quote "\<SMLdummy>" bool true false andalso |
28561 | 242 |
|
243 |
text {* |
|
28447 | 244 |
\noindent Next, we try to map HOL pairs to SML pairs, using the |
28419 | 245 |
infix ``@{verbatim "*"}'' type constructor and parentheses: |
246 |
*} |
|
28447 | 247 |
(*<*) |
37836 | 248 |
code_type %invisible prod |
28419 | 249 |
(SML) |
250 |
code_const %invisible Pair |
|
251 |
(SML) |
|
28447 | 252 |
(*>*) |
39745 | 253 |
code_type %quotett prod |
28419 | 254 |
(SML infix 2 "*") |
39745 | 255 |
code_const %quotett Pair |
28419 | 256 |
(SML "!((_),/ (_))") |
257 |
||
258 |
text {* |
|
28593 | 259 |
\noindent The initial bang ``@{verbatim "!"}'' tells the serialiser |
38450 | 260 |
never to put parentheses around the whole expression (they are |
261 |
already present), while the parentheses around argument place |
|
262 |
holders tell not to put parentheses around the arguments. The slash |
|
263 |
``@{verbatim "/"}'' (followed by arbitrary white space) inserts a |
|
264 |
space which may be used as a break if necessary during pretty |
|
265 |
printing. |
|
28419 | 266 |
|
38450 | 267 |
These examples give a glimpse what mechanisms custom serialisations |
268 |
provide; however their usage requires careful thinking in order not |
|
269 |
to introduce inconsistencies -- or, in other words: custom |
|
270 |
serialisations are completely axiomatic. |
|
28419 | 271 |
|
39643 | 272 |
A further noteworthy detail is that any special character in a |
38450 | 273 |
custom serialisation may be quoted using ``@{verbatim "'"}''; thus, |
274 |
in ``@{verbatim "fn '_ => _"}'' the first ``@{verbatim "_"}'' is a |
|
275 |
proper underscore while the second ``@{verbatim "_"}'' is a |
|
276 |
placeholder. |
|
28419 | 277 |
*} |
278 |
||
279 |
||
280 |
subsection {* @{text Haskell} serialisation *} |
|
281 |
||
282 |
text {* |
|
38450 | 283 |
For convenience, the default @{text HOL} setup for @{text Haskell} |
39063 | 284 |
maps the @{class equal} class to its counterpart in @{text Haskell}, |
285 |
giving custom serialisations for the class @{class equal} (by command |
|
39643 | 286 |
@{command_def code_class}) and its operation @{const [source] HOL.equal} |
28419 | 287 |
*} |
288 |
||
39745 | 289 |
code_class %quotett equal |
28714 | 290 |
(Haskell "Eq") |
28419 | 291 |
|
39745 | 292 |
code_const %quotett "HOL.equal" |
28419 | 293 |
(Haskell infixl 4 "==") |
294 |
||
295 |
text {* |
|
38450 | 296 |
\noindent A problem now occurs whenever a type which is an instance |
39063 | 297 |
of @{class equal} in @{text HOL} is mapped on a @{text |
38450 | 298 |
Haskell}-built-in type which is also an instance of @{text Haskell} |
299 |
@{text Eq}: |
|
28419 | 300 |
*} |
301 |
||
28564 | 302 |
typedecl %quote bar |
28419 | 303 |
|
39063 | 304 |
instantiation %quote bar :: equal |
28419 | 305 |
begin |
306 |
||
39063 | 307 |
definition %quote "HOL.equal (x\<Colon>bar) y \<longleftrightarrow> x = y" |
28419 | 308 |
|
39063 | 309 |
instance %quote by default (simp add: equal_bar_def) |
28213 | 310 |
|
30880 | 311 |
end %quote (*<*) |
312 |
||
39745 | 313 |
(*>*) code_type %quotett bar |
28419 | 314 |
(Haskell "Integer") |
315 |
||
316 |
text {* |
|
38450 | 317 |
\noindent The code generator would produce an additional instance, |
318 |
which of course is rejected by the @{text Haskell} compiler. To |
|
38506 | 319 |
suppress this additional instance, use @{command_def "code_instance"}: |
28419 | 320 |
*} |
321 |
||
39745 | 322 |
code_instance %quotett bar :: equal |
28419 | 323 |
(Haskell -) |
324 |
||
28561 | 325 |
|
28635 | 326 |
subsection {* Enhancing the target language context \label{sec:include} *} |
28561 | 327 |
|
328 |
text {* |
|
28593 | 329 |
In rare cases it is necessary to \emph{enrich} the context of a |
38505 | 330 |
target language; this is accomplished using the @{command_def |
38450 | 331 |
"code_include"} command: |
28561 | 332 |
*} |
333 |
||
39745 | 334 |
code_include %quotett Haskell "Errno" |
28561 | 335 |
{*errno i = error ("Error number: " ++ show i)*} |
336 |
||
39745 | 337 |
code_reserved %quotett Haskell Errno |
28561 | 338 |
|
339 |
text {* |
|
38450 | 340 |
\noindent Such named @{text include}s are then prepended to every |
341 |
generated code. Inspect such code in order to find out how |
|
342 |
@{command "code_include"} behaves with respect to a particular |
|
343 |
target language. |
|
28561 | 344 |
*} |
345 |
||
28419 | 346 |
end |
46519 | 347 |