| author | traytel | 
| Mon, 10 Sep 2012 09:57:21 +0200 | |
| changeset 49241 | fd11fe9dc6bb | 
| parent 48985 | 5386df44a037 | 
| child 51143 | 0a2371e7ced3 | 
| permissions | -rw-r--r-- | 
| 31050 | 1  | 
theory Adaptation  | 
| 28213 | 2  | 
imports Setup  | 
3  | 
begin  | 
|
4  | 
||
| 40351 | 5  | 
setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I))
 | 
6  | 
  #> Code_Target.extend_target ("\<SMLdummy>", ("Haskell", K I)) *}
 | 
|
| 28561 | 7  | 
|
| 31050 | 8  | 
section {* Adaptation to target languages \label{sec:adaptation} *}
 | 
| 28419 | 9  | 
|
| 28561 | 10  | 
subsection {* Adapting code generation *}
 | 
11  | 
||
12  | 
text {*
 | 
|
13  | 
The aspects of code generation introduced so far have two aspects  | 
|
14  | 
in common:  | 
|
15  | 
||
16  | 
  \begin{itemize}
 | 
|
| 38450 | 17  | 
|
18  | 
\item They act uniformly, without reference to a specific target  | 
|
19  | 
language.  | 
|
20  | 
||
| 28561 | 21  | 
    \item They are \emph{safe} in the sense that as long as you trust
 | 
22  | 
the code generator meta theory and implementation, you cannot  | 
|
| 38450 | 23  | 
produce programs that yield results which are not derivable in  | 
24  | 
the logic.  | 
|
25  | 
||
| 28561 | 26  | 
  \end{itemize}
 | 
27  | 
||
| 38450 | 28  | 
  \noindent In this section we will introduce means to \emph{adapt}
 | 
29  | 
the serialiser to a specific target language, i.e.~to print program  | 
|
30  | 
  fragments in a way which accommodates \qt{already existing}
 | 
|
31  | 
ingredients of a target language environment, for three reasons:  | 
|
| 28561 | 32  | 
|
33  | 
  \begin{itemize}
 | 
|
| 28593 | 34  | 
\item improving readability and aesthetics of generated code  | 
| 28561 | 35  | 
\item gaining efficiency  | 
36  | 
\item interface with language parts which have no direct counterpart  | 
|
37  | 
      in @{text "HOL"} (say, imperative data structures)
 | 
|
38  | 
  \end{itemize}
 | 
|
39  | 
||
40  | 
\noindent Generally, you should avoid using those features yourself  | 
|
41  | 
  \emph{at any cost}:
 | 
|
42  | 
||
43  | 
  \begin{itemize}
 | 
|
| 38450 | 44  | 
|
45  | 
\item The safe configuration methods act uniformly on every target  | 
|
46  | 
language, whereas for adaptation you have to treat each target  | 
|
47  | 
language separately.  | 
|
48  | 
||
49  | 
\item Application is extremely tedious since there is no  | 
|
50  | 
abstraction which would allow for a static check, making it easy  | 
|
51  | 
to produce garbage.  | 
|
52  | 
||
| 34155 | 53  | 
\item Subtle errors can be introduced unconsciously.  | 
| 38450 | 54  | 
|
| 28561 | 55  | 
  \end{itemize}
 | 
56  | 
||
| 38450 | 57  | 
\noindent However, even if you ought refrain from setting up  | 
58  | 
  adaptation yourself, already the @{text "HOL"} comes with some
 | 
|
59  | 
reasonable default adaptations (say, using target language list  | 
|
60  | 
syntax). There also some common adaptation cases which you can  | 
|
61  | 
setup by importing particular library theories. In order to  | 
|
62  | 
understand these, we provide some clues here; these however are not  | 
|
63  | 
supposed to replace a careful study of the sources.  | 
|
| 28561 | 64  | 
*}  | 
65  | 
||
| 38450 | 66  | 
|
| 31050 | 67  | 
subsection {* The adaptation principle *}
 | 
| 28561 | 68  | 
|
69  | 
text {*
 | 
|
| 38450 | 70  | 
  Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is
 | 
71  | 
conceptually supposed to be:  | 
|
| 28601 | 72  | 
|
73  | 
  \begin{figure}[here]
 | 
|
| 
48954
 
c548d26daa8c
avoid clash with generated Adaptation.tex on case-insensible file-systems;
 
wenzelm 
parents: 
48951 
diff
changeset
 | 
74  | 
    \includegraphics{adapt}
 | 
| 31050 | 75  | 
    \caption{The adaptation principle}
 | 
76  | 
    \label{fig:adaptation}
 | 
|
| 28601 | 77  | 
  \end{figure}
 | 
78  | 
||
79  | 
\noindent In the tame view, code generation acts as broker between  | 
|
| 38450 | 80  | 
  @{text logic}, @{text "intermediate language"} and @{text "target
 | 
81  | 
  language"} by means of @{text translation} and @{text
 | 
|
82  | 
serialisation}; for the latter, the serialiser has to observe the  | 
|
83  | 
  structure of the @{text language} itself plus some @{text reserved}
 | 
|
84  | 
keywords which have to be avoided for generated code. However, if  | 
|
85  | 
  you consider @{text adaptation} mechanisms, the code generated by
 | 
|
86  | 
the serializer is just the tip of the iceberg:  | 
|
| 28601 | 87  | 
|
88  | 
  \begin{itemize}
 | 
|
| 38450 | 89  | 
|
| 28635 | 90  | 
    \item @{text serialisation} can be \emph{parametrised} such that
 | 
91  | 
logical entities are mapped to target-specific ones  | 
|
| 38450 | 92  | 
(e.g. target-specific list syntax, see also  | 
93  | 
      \secref{sec:adaptation_mechanisms})
 | 
|
94  | 
||
| 28635 | 95  | 
\item Such parametrisations can involve references to a  | 
| 38450 | 96  | 
      target-specific standard @{text library} (e.g. using the @{text
 | 
97  | 
      Haskell} @{verbatim Maybe} type instead of the @{text HOL}
 | 
|
98  | 
      @{type "option"} type); if such are used, the corresponding
 | 
|
99  | 
      identifiers (in our example, @{verbatim Maybe}, @{verbatim
 | 
|
100  | 
      Nothing} and @{verbatim Just}) also have to be considered @{text
 | 
|
101  | 
reserved}.  | 
|
102  | 
||
| 28635 | 103  | 
\item Even more, the user can enrich the library of the  | 
| 38450 | 104  | 
      target-language by providing code snippets (\qt{@{text
 | 
105  | 
"includes"}}) which are prepended to any generated code (see  | 
|
106  | 
      \secref{sec:include}); this typically also involves further
 | 
|
107  | 
      @{text reserved} identifiers.
 | 
|
108  | 
||
| 28601 | 109  | 
  \end{itemize}
 | 
| 28635 | 110  | 
|
| 38450 | 111  | 
  \noindent As figure \ref{fig:adaptation} illustrates, all these
 | 
112  | 
adaptation mechanisms have to act consistently; it is at the  | 
|
113  | 
discretion of the user to take care for this.  | 
|
| 28561 | 114  | 
*}  | 
115  | 
||
| 31050 | 116  | 
subsection {* Common adaptation patterns *}
 | 
| 28419 | 117  | 
|
118  | 
text {*
 | 
|
| 28428 | 119  | 
  The @{theory HOL} @{theory Main} theory already provides a code
 | 
| 38450 | 120  | 
generator setup which should be suitable for most applications.  | 
121  | 
Common extensions and modifications are available by certain  | 
|
122  | 
  theories of the @{text HOL} library; beside being useful in
 | 
|
123  | 
applications, they may serve as a tutorial for customising the code  | 
|
124  | 
  generator setup (see below \secref{sec:adaptation_mechanisms}).
 | 
|
| 28419 | 125  | 
|
126  | 
  \begin{description}
 | 
|
127  | 
||
| 46519 | 128  | 
    \item[@{text "Code_Integer"}] represents @{text HOL} integers by
 | 
| 38450 | 129  | 
big integer literals in target languages.  | 
130  | 
||
| 46519 | 131  | 
    \item[@{text "Code_Char"}] represents @{text HOL} characters by
 | 
| 28419 | 132  | 
character literals in target languages.  | 
| 38450 | 133  | 
|
| 46519 | 134  | 
    \item[@{text "Code_Char_chr"}] like @{text "Code_Char"}, but
 | 
135  | 
       also offers treatment of character codes; includes @{text
 | 
|
| 38450 | 136  | 
"Code_Char"}.  | 
137  | 
||
| 46519 | 138  | 
    \item[@{text "Efficient_Nat"}] \label{eff_nat} implements
 | 
| 38450 | 139  | 
natural numbers by integers, which in general will result in  | 
140  | 
       higher efficiency; pattern matching with @{term "0\<Colon>nat"} /
 | 
|
| 46519 | 141  | 
       @{const "Suc"} is eliminated; includes @{text "Code_Integer"}
 | 
142  | 
       and @{text "Code_Numeral"}.
 | 
|
| 38450 | 143  | 
|
| 31206 | 144  | 
    \item[@{theory "Code_Numeral"}] provides an additional datatype
 | 
| 38450 | 145  | 
       @{typ index} which is mapped to target-language built-in
 | 
146  | 
integers. Useful for code setups which involve e.g.~indexing  | 
|
| 46519 | 147  | 
       of target-language arrays.  Part of @{text "HOL-Main"}.
 | 
| 38450 | 148  | 
|
149  | 
    \item[@{theory "String"}] provides an additional datatype @{typ
 | 
|
150  | 
       String.literal} which is isomorphic to strings; @{typ
 | 
|
151  | 
String.literal}s are mapped to target-language strings. Useful  | 
|
152  | 
for code setups which involve e.g.~printing (error) messages.  | 
|
| 46519 | 153  | 
       Part of @{text "HOL-Main"}.
 | 
| 28419 | 154  | 
|
155  | 
  \end{description}
 | 
|
156  | 
||
157  | 
  \begin{warn}
 | 
|
| 46519 | 158  | 
When importing any of those theories which are not part of  | 
159  | 
    @{text "HOL-Main"}, they should form the last
 | 
|
| 38450 | 160  | 
items in an import list. Since these theories adapt the code  | 
161  | 
generator setup in a non-conservative fashion, strange effects may  | 
|
162  | 
occur otherwise.  | 
|
| 28419 | 163  | 
  \end{warn}
 | 
164  | 
*}  | 
|
165  | 
||
166  | 
||
| 31050 | 167  | 
subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *}
 | 
| 28419 | 168  | 
|
169  | 
text {*
 | 
|
| 38450 | 170  | 
Consider the following function and its corresponding SML code:  | 
| 28419 | 171  | 
*}  | 
172  | 
||
| 28564 | 173  | 
primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where  | 
| 28419 | 174  | 
"in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"  | 
| 28447 | 175  | 
(*<*)  | 
| 28419 | 176  | 
code_type %invisible bool  | 
177  | 
(SML)  | 
|
178  | 
code_const %invisible True and False and "op \<and>" and Not  | 
|
179  | 
(SML and and and)  | 
|
| 28447 | 180  | 
(*>*)  | 
| 39745 | 181  | 
text %quotetypewriter {*
 | 
| 39683 | 182  | 
  @{code_stmts in_interval (SML)}
 | 
| 
39664
 
0afaf89ab591
more canonical type setting of type writer code examples
 
haftmann 
parents: 
39643 
diff
changeset
 | 
183  | 
*}  | 
| 28419 | 184  | 
|
185  | 
text {*
 | 
|
| 38450 | 186  | 
\noindent Though this is correct code, it is a little bit  | 
187  | 
unsatisfactory: boolean values and operators are materialised as  | 
|
188  | 
distinguished entities with have nothing to do with the SML-built-in  | 
|
189  | 
  notion of \qt{bool}.  This results in less readable code;
 | 
|
190  | 
additionally, eager evaluation may cause programs to loop or break  | 
|
191  | 
  which would perfectly terminate when the existing SML @{verbatim
 | 
|
192  | 
  "bool"} would be used.  To map the HOL @{typ bool} on SML @{verbatim
 | 
|
193  | 
  "bool"}, we may use \qn{custom serialisations}:
 | 
|
| 28419 | 194  | 
*}  | 
195  | 
||
| 39745 | 196  | 
code_type %quotett bool  | 
| 28419 | 197  | 
(SML "bool")  | 
| 39745 | 198  | 
code_const %quotett True and False and "op \<and>"  | 
| 28419 | 199  | 
(SML "true" and "false" and "_ andalso _")  | 
| 28213 | 200  | 
|
| 28419 | 201  | 
text {*
 | 
| 38505 | 202  | 
  \noindent The @{command_def code_type} command takes a type constructor
 | 
| 38450 | 203  | 
as arguments together with a list of custom serialisations. Each  | 
204  | 
custom serialisation starts with a target language identifier  | 
|
205  | 
followed by an expression, which during code serialisation is  | 
|
206  | 
inserted whenever the type constructor would occur. For constants,  | 
|
| 38505 | 207  | 
  @{command_def code_const} implements the corresponding mechanism.  Each
 | 
| 38450 | 208  | 
  ``@{verbatim "_"}'' in a serialisation expression is treated as a
 | 
209  | 
placeholder for the type constructor's (the constant's) arguments.  | 
|
| 28419 | 210  | 
*}  | 
211  | 
||
| 39745 | 212  | 
text %quotetypewriter {*
 | 
| 39683 | 213  | 
  @{code_stmts in_interval (SML)}
 | 
| 
39664
 
0afaf89ab591
more canonical type setting of type writer code examples
 
haftmann 
parents: 
39643 
diff
changeset
 | 
214  | 
*}  | 
| 28419 | 215  | 
|
216  | 
text {*
 | 
|
| 38450 | 217  | 
\noindent This still is not perfect: the parentheses around the  | 
218  | 
  \qt{andalso} expression are superfluous.  Though the serialiser by
 | 
|
219  | 
no means attempts to imitate the rich Isabelle syntax framework, it  | 
|
220  | 
provides some common idioms, notably associative infixes with  | 
|
221  | 
precedences which may be used here:  | 
|
| 28419 | 222  | 
*}  | 
223  | 
||
| 39745 | 224  | 
code_const %quotett "op \<and>"  | 
| 28419 | 225  | 
(SML infixl 1 "andalso")  | 
226  | 
||
| 39745 | 227  | 
text %quotetypewriter {*
 | 
| 39683 | 228  | 
  @{code_stmts in_interval (SML)}
 | 
| 
39664
 
0afaf89ab591
more canonical type setting of type writer code examples
 
haftmann 
parents: 
39643 
diff
changeset
 | 
229  | 
*}  | 
| 28419 | 230  | 
|
231  | 
text {*
 | 
|
| 38450 | 232  | 
\noindent The attentive reader may ask how we assert that no  | 
233  | 
generated code will accidentally overwrite. For this reason the  | 
|
234  | 
serialiser has an internal table of identifiers which have to be  | 
|
235  | 
avoided to be used for new declarations. Initially, this table  | 
|
236  | 
typically contains the keywords of the target language. It can be  | 
|
237  | 
extended manually, thus avoiding accidental overwrites, using the  | 
|
| 38505 | 238  | 
  @{command_def "code_reserved"} command:
 | 
| 28561 | 239  | 
*}  | 
240  | 
||
| 40351 | 241  | 
code_reserved %quote "\<SMLdummy>" bool true false andalso  | 
| 28561 | 242  | 
|
243  | 
text {*
 | 
|
| 28447 | 244  | 
\noindent Next, we try to map HOL pairs to SML pairs, using the  | 
| 28419 | 245  | 
  infix ``@{verbatim "*"}'' type constructor and parentheses:
 | 
246  | 
*}  | 
|
| 28447 | 247  | 
(*<*)  | 
| 37836 | 248  | 
code_type %invisible prod  | 
| 28419 | 249  | 
(SML)  | 
250  | 
code_const %invisible Pair  | 
|
251  | 
(SML)  | 
|
| 28447 | 252  | 
(*>*)  | 
| 39745 | 253  | 
code_type %quotett prod  | 
| 28419 | 254  | 
(SML infix 2 "*")  | 
| 39745 | 255  | 
code_const %quotett Pair  | 
| 28419 | 256  | 
(SML "!((_),/ (_))")  | 
257  | 
||
258  | 
text {*
 | 
|
| 28593 | 259  | 
  \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
 | 
| 38450 | 260  | 
never to put parentheses around the whole expression (they are  | 
261  | 
already present), while the parentheses around argument place  | 
|
262  | 
holders tell not to put parentheses around the arguments. The slash  | 
|
263  | 
  ``@{verbatim "/"}'' (followed by arbitrary white space) inserts a
 | 
|
264  | 
space which may be used as a break if necessary during pretty  | 
|
265  | 
printing.  | 
|
| 28419 | 266  | 
|
| 38450 | 267  | 
These examples give a glimpse what mechanisms custom serialisations  | 
268  | 
provide; however their usage requires careful thinking in order not  | 
|
269  | 
to introduce inconsistencies -- or, in other words: custom  | 
|
270  | 
serialisations are completely axiomatic.  | 
|
| 28419 | 271  | 
|
| 39643 | 272  | 
A further noteworthy detail is that any special character in a  | 
| 38450 | 273  | 
  custom serialisation may be quoted using ``@{verbatim "'"}''; thus,
 | 
274  | 
  in ``@{verbatim "fn '_ => _"}'' the first ``@{verbatim "_"}'' is a
 | 
|
275  | 
  proper underscore while the second ``@{verbatim "_"}'' is a
 | 
|
276  | 
placeholder.  | 
|
| 28419 | 277  | 
*}  | 
278  | 
||
279  | 
||
280  | 
subsection {* @{text Haskell} serialisation *}
 | 
|
281  | 
||
282  | 
text {*
 | 
|
| 38450 | 283  | 
  For convenience, the default @{text HOL} setup for @{text Haskell}
 | 
| 39063 | 284  | 
  maps the @{class equal} class to its counterpart in @{text Haskell},
 | 
285  | 
  giving custom serialisations for the class @{class equal} (by command
 | 
|
| 39643 | 286  | 
  @{command_def code_class}) and its operation @{const [source] HOL.equal}
 | 
| 28419 | 287  | 
*}  | 
288  | 
||
| 39745 | 289  | 
code_class %quotett equal  | 
| 28714 | 290  | 
(Haskell "Eq")  | 
| 28419 | 291  | 
|
| 39745 | 292  | 
code_const %quotett "HOL.equal"  | 
| 28419 | 293  | 
(Haskell infixl 4 "==")  | 
294  | 
||
295  | 
text {*
 | 
|
| 38450 | 296  | 
\noindent A problem now occurs whenever a type which is an instance  | 
| 39063 | 297  | 
  of @{class equal} in @{text HOL} is mapped on a @{text
 | 
| 38450 | 298  | 
  Haskell}-built-in type which is also an instance of @{text Haskell}
 | 
299  | 
  @{text Eq}:
 | 
|
| 28419 | 300  | 
*}  | 
301  | 
||
| 28564 | 302  | 
typedecl %quote bar  | 
| 28419 | 303  | 
|
| 39063 | 304  | 
instantiation %quote bar :: equal  | 
| 28419 | 305  | 
begin  | 
306  | 
||
| 39063 | 307  | 
definition %quote "HOL.equal (x\<Colon>bar) y \<longleftrightarrow> x = y"  | 
| 28419 | 308  | 
|
| 39063 | 309  | 
instance %quote by default (simp add: equal_bar_def)  | 
| 28213 | 310  | 
|
| 30880 | 311  | 
end %quote (*<*)  | 
312  | 
||
| 39745 | 313  | 
(*>*) code_type %quotett bar  | 
| 28419 | 314  | 
(Haskell "Integer")  | 
315  | 
||
316  | 
text {*
 | 
|
| 38450 | 317  | 
\noindent The code generator would produce an additional instance,  | 
318  | 
  which of course is rejected by the @{text Haskell} compiler.  To
 | 
|
| 38506 | 319  | 
  suppress this additional instance, use @{command_def "code_instance"}:
 | 
| 28419 | 320  | 
*}  | 
321  | 
||
| 39745 | 322  | 
code_instance %quotett bar :: equal  | 
| 28419 | 323  | 
(Haskell -)  | 
324  | 
||
| 28561 | 325  | 
|
| 28635 | 326  | 
subsection {* Enhancing the target language context \label{sec:include} *}
 | 
| 28561 | 327  | 
|
328  | 
text {*
 | 
|
| 28593 | 329  | 
  In rare cases it is necessary to \emph{enrich} the context of a
 | 
| 38505 | 330  | 
  target language; this is accomplished using the @{command_def
 | 
| 38450 | 331  | 
"code_include"} command:  | 
| 28561 | 332  | 
*}  | 
333  | 
||
| 39745 | 334  | 
code_include %quotett Haskell "Errno"  | 
| 28561 | 335  | 
{*errno i = error ("Error number: " ++ show i)*}
 | 
336  | 
||
| 39745 | 337  | 
code_reserved %quotett Haskell Errno  | 
| 28561 | 338  | 
|
339  | 
text {*
 | 
|
| 38450 | 340  | 
  \noindent Such named @{text include}s are then prepended to every
 | 
341  | 
generated code. Inspect such code in order to find out how  | 
|
342  | 
  @{command "code_include"} behaves with respect to a particular
 | 
|
343  | 
target language.  | 
|
| 28561 | 344  | 
*}  | 
345  | 
||
| 28419 | 346  | 
end  | 
| 46519 | 347  |