|
1 theory Adaptation |
|
2 imports Setup |
|
3 begin |
|
4 |
|
5 setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) |
|
6 #> Code_Target.extend_target ("\<SMLdummy>", ("Haskell", K I)) *} |
|
7 |
|
8 section {* Adaptation to target languages \label{sec:adaptation} *} |
|
9 |
|
10 subsection {* Adapting code generation *} |
|
11 |
|
12 text {* |
|
13 The aspects of code generation introduced so far have two aspects |
|
14 in common: |
|
15 |
|
16 \begin{itemize} |
|
17 |
|
18 \item They act uniformly, without reference to a specific target |
|
19 language. |
|
20 |
|
21 \item They are \emph{safe} in the sense that as long as you trust |
|
22 the code generator meta theory and implementation, you cannot |
|
23 produce programs that yield results which are not derivable in |
|
24 the logic. |
|
25 |
|
26 \end{itemize} |
|
27 |
|
28 \noindent In this section we will introduce means to \emph{adapt} |
|
29 the serialiser to a specific target language, i.e.~to print program |
|
30 fragments in a way which accommodates \qt{already existing} |
|
31 ingredients of a target language environment, for three reasons: |
|
32 |
|
33 \begin{itemize} |
|
34 \item improving readability and aesthetics of generated code |
|
35 \item gaining efficiency |
|
36 \item interface with language parts which have no direct counterpart |
|
37 in @{text "HOL"} (say, imperative data structures) |
|
38 \end{itemize} |
|
39 |
|
40 \noindent Generally, you should avoid using those features yourself |
|
41 \emph{at any cost}: |
|
42 |
|
43 \begin{itemize} |
|
44 |
|
45 \item The safe configuration methods act uniformly on every target |
|
46 language, whereas for adaptation you have to treat each target |
|
47 language separately. |
|
48 |
|
49 \item Application is extremely tedious since there is no |
|
50 abstraction which would allow for a static check, making it easy |
|
51 to produce garbage. |
|
52 |
|
53 \item Subtle errors can be introduced unconsciously. |
|
54 |
|
55 \end{itemize} |
|
56 |
|
57 \noindent However, even if you ought refrain from setting up |
|
58 adaptation yourself, already the @{text "HOL"} comes with some |
|
59 reasonable default adaptations (say, using target language list |
|
60 syntax). There also some common adaptation cases which you can |
|
61 setup by importing particular library theories. In order to |
|
62 understand these, we provide some clues here; these however are not |
|
63 supposed to replace a careful study of the sources. |
|
64 *} |
|
65 |
|
66 |
|
67 subsection {* The adaptation principle *} |
|
68 |
|
69 text {* |
|
70 Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is |
|
71 conceptually supposed to be: |
|
72 |
|
73 \begin{figure}[here] |
|
74 \includegraphics{adapt} |
|
75 \caption{The adaptation principle} |
|
76 \label{fig:adaptation} |
|
77 \end{figure} |
|
78 |
|
79 \noindent In the tame view, code generation acts as broker between |
|
80 @{text logic}, @{text "intermediate language"} and @{text "target |
|
81 language"} by means of @{text translation} and @{text |
|
82 serialisation}; for the latter, the serialiser has to observe the |
|
83 structure of the @{text language} itself plus some @{text reserved} |
|
84 keywords which have to be avoided for generated code. However, if |
|
85 you consider @{text adaptation} mechanisms, the code generated by |
|
86 the serializer is just the tip of the iceberg: |
|
87 |
|
88 \begin{itemize} |
|
89 |
|
90 \item @{text serialisation} can be \emph{parametrised} such that |
|
91 logical entities are mapped to target-specific ones |
|
92 (e.g. target-specific list syntax, see also |
|
93 \secref{sec:adaptation_mechanisms}) |
|
94 |
|
95 \item Such parametrisations can involve references to a |
|
96 target-specific standard @{text library} (e.g. using the @{text |
|
97 Haskell} @{verbatim Maybe} type instead of the @{text HOL} |
|
98 @{type "option"} type); if such are used, the corresponding |
|
99 identifiers (in our example, @{verbatim Maybe}, @{verbatim |
|
100 Nothing} and @{verbatim Just}) also have to be considered @{text |
|
101 reserved}. |
|
102 |
|
103 \item Even more, the user can enrich the library of the |
|
104 target-language by providing code snippets (\qt{@{text |
|
105 "includes"}}) which are prepended to any generated code (see |
|
106 \secref{sec:include}); this typically also involves further |
|
107 @{text reserved} identifiers. |
|
108 |
|
109 \end{itemize} |
|
110 |
|
111 \noindent As figure \ref{fig:adaptation} illustrates, all these |
|
112 adaptation mechanisms have to act consistently; it is at the |
|
113 discretion of the user to take care for this. |
|
114 *} |
|
115 |
|
116 subsection {* Common adaptation patterns *} |
|
117 |
|
118 text {* |
|
119 The @{theory HOL} @{theory Main} theory already provides a code |
|
120 generator setup which should be suitable for most applications. |
|
121 Common extensions and modifications are available by certain |
|
122 theories of the @{text HOL} library; beside being useful in |
|
123 applications, they may serve as a tutorial for customising the code |
|
124 generator setup (see below \secref{sec:adaptation_mechanisms}). |
|
125 |
|
126 \begin{description} |
|
127 |
|
128 \item[@{text "Code_Integer"}] represents @{text HOL} integers by |
|
129 big integer literals in target languages. |
|
130 |
|
131 \item[@{text "Code_Char"}] represents @{text HOL} characters by |
|
132 character literals in target languages. |
|
133 |
|
134 \item[@{text "Code_Char_chr"}] like @{text "Code_Char"}, but |
|
135 also offers treatment of character codes; includes @{text |
|
136 "Code_Char"}. |
|
137 |
|
138 \item[@{text "Efficient_Nat"}] \label{eff_nat} implements |
|
139 natural numbers by integers, which in general will result in |
|
140 higher efficiency; pattern matching with @{term "0\<Colon>nat"} / |
|
141 @{const "Suc"} is eliminated; includes @{text "Code_Integer"} |
|
142 and @{text "Code_Numeral"}. |
|
143 |
|
144 \item[@{theory "Code_Numeral"}] provides an additional datatype |
|
145 @{typ index} which is mapped to target-language built-in |
|
146 integers. Useful for code setups which involve e.g.~indexing |
|
147 of target-language arrays. Part of @{text "HOL-Main"}. |
|
148 |
|
149 \item[@{theory "String"}] provides an additional datatype @{typ |
|
150 String.literal} which is isomorphic to strings; @{typ |
|
151 String.literal}s are mapped to target-language strings. Useful |
|
152 for code setups which involve e.g.~printing (error) messages. |
|
153 Part of @{text "HOL-Main"}. |
|
154 |
|
155 \end{description} |
|
156 |
|
157 \begin{warn} |
|
158 When importing any of those theories which are not part of |
|
159 @{text "HOL-Main"}, they should form the last |
|
160 items in an import list. Since these theories adapt the code |
|
161 generator setup in a non-conservative fashion, strange effects may |
|
162 occur otherwise. |
|
163 \end{warn} |
|
164 *} |
|
165 |
|
166 |
|
167 subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *} |
|
168 |
|
169 text {* |
|
170 Consider the following function and its corresponding SML code: |
|
171 *} |
|
172 |
|
173 primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where |
|
174 "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l" |
|
175 (*<*) |
|
176 code_type %invisible bool |
|
177 (SML) |
|
178 code_const %invisible True and False and "op \<and>" and Not |
|
179 (SML and and and) |
|
180 (*>*) |
|
181 text %quotetypewriter {* |
|
182 @{code_stmts in_interval (SML)} |
|
183 *} |
|
184 |
|
185 text {* |
|
186 \noindent Though this is correct code, it is a little bit |
|
187 unsatisfactory: boolean values and operators are materialised as |
|
188 distinguished entities with have nothing to do with the SML-built-in |
|
189 notion of \qt{bool}. This results in less readable code; |
|
190 additionally, eager evaluation may cause programs to loop or break |
|
191 which would perfectly terminate when the existing SML @{verbatim |
|
192 "bool"} would be used. To map the HOL @{typ bool} on SML @{verbatim |
|
193 "bool"}, we may use \qn{custom serialisations}: |
|
194 *} |
|
195 |
|
196 code_type %quotett bool |
|
197 (SML "bool") |
|
198 code_const %quotett True and False and "op \<and>" |
|
199 (SML "true" and "false" and "_ andalso _") |
|
200 |
|
201 text {* |
|
202 \noindent The @{command_def code_type} command takes a type constructor |
|
203 as arguments together with a list of custom serialisations. Each |
|
204 custom serialisation starts with a target language identifier |
|
205 followed by an expression, which during code serialisation is |
|
206 inserted whenever the type constructor would occur. For constants, |
|
207 @{command_def code_const} implements the corresponding mechanism. Each |
|
208 ``@{verbatim "_"}'' in a serialisation expression is treated as a |
|
209 placeholder for the type constructor's (the constant's) arguments. |
|
210 *} |
|
211 |
|
212 text %quotetypewriter {* |
|
213 @{code_stmts in_interval (SML)} |
|
214 *} |
|
215 |
|
216 text {* |
|
217 \noindent This still is not perfect: the parentheses around the |
|
218 \qt{andalso} expression are superfluous. Though the serialiser by |
|
219 no means attempts to imitate the rich Isabelle syntax framework, it |
|
220 provides some common idioms, notably associative infixes with |
|
221 precedences which may be used here: |
|
222 *} |
|
223 |
|
224 code_const %quotett "op \<and>" |
|
225 (SML infixl 1 "andalso") |
|
226 |
|
227 text %quotetypewriter {* |
|
228 @{code_stmts in_interval (SML)} |
|
229 *} |
|
230 |
|
231 text {* |
|
232 \noindent The attentive reader may ask how we assert that no |
|
233 generated code will accidentally overwrite. For this reason the |
|
234 serialiser has an internal table of identifiers which have to be |
|
235 avoided to be used for new declarations. Initially, this table |
|
236 typically contains the keywords of the target language. It can be |
|
237 extended manually, thus avoiding accidental overwrites, using the |
|
238 @{command_def "code_reserved"} command: |
|
239 *} |
|
240 |
|
241 code_reserved %quote "\<SMLdummy>" bool true false andalso |
|
242 |
|
243 text {* |
|
244 \noindent Next, we try to map HOL pairs to SML pairs, using the |
|
245 infix ``@{verbatim "*"}'' type constructor and parentheses: |
|
246 *} |
|
247 (*<*) |
|
248 code_type %invisible prod |
|
249 (SML) |
|
250 code_const %invisible Pair |
|
251 (SML) |
|
252 (*>*) |
|
253 code_type %quotett prod |
|
254 (SML infix 2 "*") |
|
255 code_const %quotett Pair |
|
256 (SML "!((_),/ (_))") |
|
257 |
|
258 text {* |
|
259 \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser |
|
260 never to put parentheses around the whole expression (they are |
|
261 already present), while the parentheses around argument place |
|
262 holders tell not to put parentheses around the arguments. The slash |
|
263 ``@{verbatim "/"}'' (followed by arbitrary white space) inserts a |
|
264 space which may be used as a break if necessary during pretty |
|
265 printing. |
|
266 |
|
267 These examples give a glimpse what mechanisms custom serialisations |
|
268 provide; however their usage requires careful thinking in order not |
|
269 to introduce inconsistencies -- or, in other words: custom |
|
270 serialisations are completely axiomatic. |
|
271 |
|
272 A further noteworthy detail is that any special character in a |
|
273 custom serialisation may be quoted using ``@{verbatim "'"}''; thus, |
|
274 in ``@{verbatim "fn '_ => _"}'' the first ``@{verbatim "_"}'' is a |
|
275 proper underscore while the second ``@{verbatim "_"}'' is a |
|
276 placeholder. |
|
277 *} |
|
278 |
|
279 |
|
280 subsection {* @{text Haskell} serialisation *} |
|
281 |
|
282 text {* |
|
283 For convenience, the default @{text HOL} setup for @{text Haskell} |
|
284 maps the @{class equal} class to its counterpart in @{text Haskell}, |
|
285 giving custom serialisations for the class @{class equal} (by command |
|
286 @{command_def code_class}) and its operation @{const [source] HOL.equal} |
|
287 *} |
|
288 |
|
289 code_class %quotett equal |
|
290 (Haskell "Eq") |
|
291 |
|
292 code_const %quotett "HOL.equal" |
|
293 (Haskell infixl 4 "==") |
|
294 |
|
295 text {* |
|
296 \noindent A problem now occurs whenever a type which is an instance |
|
297 of @{class equal} in @{text HOL} is mapped on a @{text |
|
298 Haskell}-built-in type which is also an instance of @{text Haskell} |
|
299 @{text Eq}: |
|
300 *} |
|
301 |
|
302 typedecl %quote bar |
|
303 |
|
304 instantiation %quote bar :: equal |
|
305 begin |
|
306 |
|
307 definition %quote "HOL.equal (x\<Colon>bar) y \<longleftrightarrow> x = y" |
|
308 |
|
309 instance %quote by default (simp add: equal_bar_def) |
|
310 |
|
311 end %quote (*<*) |
|
312 |
|
313 (*>*) code_type %quotett bar |
|
314 (Haskell "Integer") |
|
315 |
|
316 text {* |
|
317 \noindent The code generator would produce an additional instance, |
|
318 which of course is rejected by the @{text Haskell} compiler. To |
|
319 suppress this additional instance, use @{command_def "code_instance"}: |
|
320 *} |
|
321 |
|
322 code_instance %quotett bar :: equal |
|
323 (Haskell -) |
|
324 |
|
325 |
|
326 subsection {* Enhancing the target language context \label{sec:include} *} |
|
327 |
|
328 text {* |
|
329 In rare cases it is necessary to \emph{enrich} the context of a |
|
330 target language; this is accomplished using the @{command_def |
|
331 "code_include"} command: |
|
332 *} |
|
333 |
|
334 code_include %quotett Haskell "Errno" |
|
335 {*errno i = error ("Error number: " ++ show i)*} |
|
336 |
|
337 code_reserved %quotett Haskell Errno |
|
338 |
|
339 text {* |
|
340 \noindent Such named @{text include}s are then prepended to every |
|
341 generated code. Inspect such code in order to find out how |
|
342 @{command "code_include"} behaves with respect to a particular |
|
343 target language. |
|
344 *} |
|
345 |
|
346 end |
|
347 |