theory Adaption
imports Setup
begin

setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", I)) *}

section {* Adaption to target languages \label{sec:adaption} *}

subsection {* Adapting code generation *}

text {*
  The code generation mechanisms introduced so far have two aspects
  in common:

  \begin{itemize}
    \item They act uniformly, without reference to a specific
       target language.
    \item They are \emph{safe} in the sense that as long as you trust
       the code generator meta theory and implementation, you cannot
       produce programs that yield results which are not derivable
       in the logic.
  \end{itemize}

  \noindent In this section we will introduce means to \emph{adapt} the serialiser
  to a specific target language, i.e.~to print program fragments
  in a way which accommodates \qt{already existing} ingredients of
  a target language environment, for three reasons:

  \begin{itemize}
    \item improving readability and aesthetics of generated code
    \item gaining efficiency
    \item interfacing with language parts which have no direct counterpart
      in @{text "HOL"} (say, imperative data structures)
  \end{itemize}

  \noindent Generally, you should avoid using those features yourself
  \emph{at any cost}:

  \begin{itemize}
    \item The safe configuration methods act uniformly on every target language,
      whereas for adaption you have to treat each target language separately.
    \item Application is extremely tedious since there is no abstraction
      which would allow for a static check, making it easy to produce garbage.
    \item More or less subtle errors can be introduced unintentionally.
  \end{itemize}

  \noindent However, even if you ought to refrain from setting up adaption
  yourself, @{text "HOL"} already comes with some reasonable default
  adaptions (say, using target language list syntax).  There are also some
  common adaption cases which you can set up by importing particular
  library theories.  In order to understand these, we provide some clues here;
  these however are not supposed to replace a careful study of the sources.
*}

subsection {* The adaption principle *}

text {*
  The following figure illustrates what \qt{adaption} is conceptually
  supposed to be:

  \begin{figure}[here]
    \begin{tikzpicture}[scale = 0.5]
      \tikzstyle water=[color = blue, thick]
      \tikzstyle ice=[color = black, very thick, cap = round, join = round, fill = white]
      \tikzstyle process=[color = green, semithick, ->]
      \tikzstyle adaption=[color = red, semithick, ->]
      \tikzstyle target=[color = black]
      \foreach \x in {0, ..., 24}
        \draw[style=water] (\x, 0.25) sin + (0.25, 0.25) cos + (0.25, -0.25) sin
          + (0.25, -0.25) cos + (0.25, 0.25);
      \draw[style=ice] (1, 0) --
        (3, 6) node[above, fill=white] {logic} -- (5, 0) -- cycle;
      \draw[style=ice] (9, 0) --
        (11, 6) node[above, fill=white] {intermediate language} -- (13, 0) -- cycle;
      \draw[style=ice] (15, -6) --
        (19, 6) node[above, fill=white] {target language} -- (23, -6) -- cycle;
      \draw[style=process]
        (3.5, 3) .. controls (7, 5) .. node[fill=white] {translation} (10.5, 3);
      \draw[style=process]
        (11.5, 3) .. controls (15, 5) .. node[fill=white] (serialisation) {serialisation} (18.5, 3);
      \node (adaption) at (11, -2) [style=adaption] {adaption};
      \node at (19, 3) [rotate=90] {generated};
      \node at (19.5, -5) {language};
      \node at (19.5, -3) {library};
      \node (includes) at (19.5, -1) {includes};
      \node (reserved) at (16.5, -3) [rotate=72] {reserved}; % proper 71.57
      \draw[style=process]
        (includes) -- (serialisation);
      \draw[style=process]
        (reserved) -- (serialisation);
      \draw[style=adaption]
        (adaption) -- (serialisation);
      \draw[style=adaption]
        (adaption) -- (includes);
      \draw[style=adaption]
        (adaption) -- (reserved);
    \end{tikzpicture}
    \caption{The adaption principle}
    \label{fig:adaption}
  \end{figure}

  \noindent In the tame view, code generation acts as a broker between
  @{text logic}, @{text "intermediate language"} and
  @{text "target language"} by means of @{text translation} and
  @{text serialisation};  for the latter, the serialiser has to observe
  the structure of the @{text language} itself plus some @{text reserved}
  keywords which have to be avoided for generated code.
  However, if you consider @{text adaption} mechanisms, the code generated
  by the serialiser is just the tip of the iceberg:

  \begin{itemize}
    \item @{text serialisation} can be \emph{parametrised} such that
      logical entities are mapped to target-specific ones
      (say, target language list syntax).
    \item Such parametrisations may refer to the target-specific
      standard @{text library}; the identifiers used there
      then also have to be considered @{text reserved}.
    \item The target language context can be enriched by
      @{text "includes"} which are prepended to generated code;
      this typically involves further @{text reserved} identifiers.
  \end{itemize}
*}

subsection {* Common adaption cases *}

text {*
  The @{theory HOL} @{theory Main} theory already provides a code
  generator setup
  which should be suitable for most applications.  Common extensions
  and modifications are available by certain theories of the @{text HOL}
  library; besides being useful in applications, they may serve
  as a tutorial for customising the code generator setup (see
  \secref{sec:adaption_mechanisms} below).

  \begin{description}

    \item[@{theory "Code_Integer"}] represents @{text HOL} integers by big
       integer literals in target languages.
    \item[@{theory "Code_Char"}] represents @{text HOL} characters by
       character literals in target languages.
    \item[@{theory "Code_Char_chr"}] like @{text "Code_Char"},
       but also offers treatment of character codes; includes
       @{theory "Code_Char"}.
    \item[@{theory "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers,
       which in general will result in higher efficiency; pattern
       matching with @{term "0\<Colon>nat"} / @{const "Suc"}
       is eliminated;  includes @{theory "Code_Integer"}
       and @{theory "Code_Index"}.
    \item[@{theory "Code_Index"}] provides an additional datatype
       @{typ index} which is mapped to target-language built-in integers.
       Useful for code setups which involve e.g.~indexing of
       target-language arrays.
    \item[@{theory "Code_Message"}] provides an additional datatype
       @{typ message_string} which is isomorphic to strings;
       @{typ message_string}s are mapped to target-language strings.
       Useful for code setups which involve e.g.~printing (error) messages.

  \end{description}

  \begin{warn}
    When importing any of these theories, they should form the last
    items in an import list.  Since these theories adapt the
    code generator setup in a non-conservative fashion,
    strange effects may occur otherwise.
  \end{warn}
*}


subsection {* Adaption mechanisms \label{sec:adaption_mechanisms} *}

text {*
  Consider the following function and its corresponding
  SML code:
*}

primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where
  "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"
(*<*)
code_type %invisible bool
  (SML)
code_const %invisible True and False and "op \<and>" and Not
  (SML and and and)
(*>*)
text %quote {*@{code_stmts in_interval (SML)}*}

text {*
  \noindent Though this is correct code, it is a little bit unsatisfactory:
  boolean values and operators are materialised as distinguished
  entities which have nothing to do with the SML-built-in notion
  of \qt{bool}.  This results in less readable code;
  additionally, eager evaluation may cause programs to
  loop or break which would terminate perfectly well if
  the existing SML @{verbatim "bool"} were used.  To map
  the HOL @{typ bool} to SML @{verbatim "bool"}, we may use
  \qn{custom serialisations}:
*}

code_type %quotett bool
  (SML "bool")
code_const %quotett True and False and "op \<and>"
  (SML "true" and "false" and "_ andalso _")

text {*
  \noindent The @{command code_type} command takes a type constructor
  as argument together with a list of custom serialisations.
  Each custom serialisation starts with a target language
  identifier followed by an expression, which during
  code serialisation is inserted whenever the type constructor
  would occur.  For constants, @{command code_const} implements
  the corresponding mechanism.  Each ``@{verbatim "_"}'' in
  a serialisation expression is treated as a placeholder
  for the type constructor's (the constant's) arguments.
*}

text %quote {*@{code_stmts in_interval (SML)}*}

text {*
  \noindent This is still not perfect: the parentheses
  around the \qt{andalso} expression are superfluous.
  Though the serialiser
  by no means attempts to imitate the rich Isabelle syntax
  framework, it provides some common idioms, notably
  associative infixes with precedences, which may be used here:
*}

code_const %quotett "op \<and>"
  (SML infixl 1 "andalso")

text %quote {*@{code_stmts in_interval (SML)}*}

text {*
  \noindent The attentive reader may ask how we assert that no generated
  code will accidentally overwrite such built-in identifiers.
  For this reason the serialiser has
  an internal table of identifiers which must not be used
  for new declarations.  Initially, this table typically contains the
  keywords of the target language.  It can be extended manually, thus avoiding
  accidental overwrites, using the @{command "code_reserved"} command:
*}

code_reserved %quote "\<SML>" bool true false andalso

text {*
  \noindent Next, we try to map HOL pairs to SML pairs, using the
  infix ``@{verbatim "*"}'' type constructor and parentheses:
*}
(*<*)
code_type %invisible *
  (SML)
code_const %invisible Pair
  (SML)
(*>*)
code_type %quotett *
  (SML infix 2 "*")
code_const %quotett Pair
  (SML "!((_),/ (_))")

text {*
  \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
  never to put
  parentheses around the whole expression (they are already present),
  while the parentheses around argument placeholders
  tell it not to put parentheses around the arguments.
  The slash ``@{verbatim "/"}'' (followed by arbitrary white space)
  inserts a space which may be used as a break if necessary
  during pretty printing.

  These examples give a glimpse of the mechanisms
  custom serialisations provide; however their usage
  requires careful thinking in order not to introduce
  inconsistencies -- or, in other words:
  custom serialisations are completely axiomatic.

  A further noteworthy detail is that any special
  character in a custom serialisation may be quoted
  using ``@{verbatim "'"}''; thus, in
  ``@{verbatim "fn '_ => _"}'' the first
  ``@{verbatim "_"}'' is a proper underscore while the
  second ``@{verbatim "_"}'' is a placeholder.
*}
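
text {*
  \noindent As a further sketch in the same style, disjunction and
  conditionals could be mapped to their SML counterparts.  This is
  an illustration only, not part of the setup discussed above: the
  precedence @{text 0} for ``@{verbatim "orelse"}'' is an assumption,
  chosen so that it binds weaker than ``@{verbatim "andalso"}'', and the
  serialisation for @{const If} reuses the bang, slash and
  parenthesised placeholder idioms just explained:
*}

(* sketch only: precedence 0 and the layout of the conditional are assumptions *)
code_const %quotett "op \<or>" and If
  (SML infixl 0 "orelse" and "!(if (_)/ then (_)/ else (_))")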


subsection {* @{text Haskell} serialisation *}

text {*
  For convenience, the default
  @{text HOL} setup for @{text Haskell} maps the @{class eq} class to
  its counterpart in @{text Haskell}, giving custom serialisations
  for the class @{class eq} (by command @{command code_class}) and its operation
  @{const HOL.eq}:
*}

code_class %quotett eq
  (Haskell "Eq" where "HOL.eq" \<equiv> "(==)")

code_const %quotett "op ="
  (Haskell infixl 4 "==")

text {*
  \noindent A problem now occurs whenever a type which
  is an instance of @{class eq} in @{text HOL} is mapped
  to a @{text Haskell}-built-in type which is also an instance
  of @{text Haskell} @{text Eq}:
*}

typedecl %quote bar

instantiation %quote bar :: eq
begin

definition %quote "eq_class.eq (x\<Colon>bar) y \<longleftrightarrow> x = y"

instance %quote by default (simp add: eq_bar_def)

end %quote

code_type %quotett bar
  (Haskell "Integer")

text {*
  \noindent The code generator would produce
  an additional, duplicate instance, which of course is rejected by the @{text Haskell}
  compiler.
  To suppress this additional instance, use
  @{text "code_instance"}:
*}

code_instance %quotett bar :: eq
  (Haskell -)


subsection {* Enhancing the target language context *}

text {*
  In rare cases it is necessary to \emph{enrich} the context of a
  target language;  this is accomplished using the @{command "code_include"}
  command:
*}

code_include %quotett Haskell "Errno"
{*errno i = error ("Error number: " ++ show i)*}

code_reserved %quotett Haskell Errno

text {*
  \noindent Such named @{text include}s are then prepended to all generated code.
  Inspect such code in order to find out how @{command "code_include"} behaves
  with respect to a particular target language.
*}

end