| author | haftmann | 
| Tue, 10 Aug 2010 13:58:26 +0200 | |
| changeset 38306 | a062fdf777df | 
| parent 37836 | 2bcce92be291 | 
| child 38450 | ada5814c9d87 | 
| permissions | -rw-r--r-- | 
| 31050 | 1 | theory Adaptation | 
| 28213 | 2 | imports Setup | 
| 3 | begin | |
| 4 | ||
| 28679 | 5 | setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) *}
 | 
| 28561 | 6 | |
| 31050 | 7 | section {* Adaptation to target languages \label{sec:adaptation} *}
 | 
| 28419 | 8 | |
| 28561 | 9 | subsection {* Adapting code generation *}
 | 
| 10 | ||
| 11 | text {*
 | |
| 12 | The aspects of code generation introduced so far have two aspects | |
| 13 | in common: | |
| 14 | ||
| 15 |   \begin{itemize}
 | |
| 16 | \item They act uniformly, without reference to a specific | |
| 17 | target language. | |
| 18 |     \item They are \emph{safe} in the sense that as long as you trust
 | |
| 19 | the code generator meta theory and implementation, you cannot | |
| 20 | produce programs that yield results which are not derivable | |
| 21 | in the logic. | |
| 22 |   \end{itemize}
 | |
| 23 | ||
| 24 |   \noindent In this section we will introduce means to \emph{adapt} the serialiser
 | |
| 25 | to a specific target language, i.e.~to print program fragments | |
| 28593 | 26 |   in a way which accommodates \qt{already existing} ingredients of
 | 
| 28561 | 27 | a target language environment, for three reasons: | 
| 28 | ||
| 29 |   \begin{itemize}
 | |
| 28593 | 30 | \item improving readability and aesthetics of generated code | 
| 28561 | 31 | \item gaining efficiency | 
| 32 | \item interface with language parts which have no direct counterpart | |
| 33 |       in @{text "HOL"} (say, imperative data structures)
 | |
| 34 |   \end{itemize}
 | |
| 35 | ||
| 36 | \noindent Generally, you should avoid using those features yourself | |
| 37 |   \emph{at any cost}:
 | |
| 38 | ||
| 39 |   \begin{itemize}
 | |
| 40 | \item The safe configuration methods act uniformly on every target language, | |
| 34155 | 41 | whereas for adaptation you have to treat each target language separately. | 
| 28561 | 42 | \item Application is extremely tedious since there is no abstraction | 
| 28593 | 43 | which would allow for a static check, making it easy to produce garbage. | 
| 34155 | 44 | \item Subtle errors can be introduced unconsciously. | 
| 28561 | 45 |   \end{itemize}
 | 
| 46 | ||
| 31050 | 47 | \noindent However, even if you ought refrain from setting up adaptation | 
| 28561 | 48 |   yourself, already the @{text "HOL"} comes with some reasonable default
 | 
| 31050 | 49 | adaptations (say, using target language list syntax). There also some | 
| 50 | common adaptation cases which you can setup by importing particular | |
| 28561 | 51 | library theories. In order to understand these, we provide some clues here; | 
| 52 | these however are not supposed to replace a careful study of the sources. | |
| 53 | *} | |
| 54 | ||
| 31050 | 55 | subsection {* The adaptation principle *}
 | 
| 28561 | 56 | |
| 57 | text {*
 | |
| 31050 | 58 |   Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is conceptually
 | 
| 28601 | 59 | supposed to be: | 
| 60 | ||
| 61 |   \begin{figure}[here]
 | |
| 31050 | 62 |     \includegraphics{adaptation}
 | 
| 63 |     \caption{The adaptation principle}
 | |
| 64 |     \label{fig:adaptation}
 | |
| 28601 | 65 |   \end{figure}
 | 
| 66 | ||
| 67 | \noindent In the tame view, code generation acts as broker between | |
| 68 |   @{text logic}, @{text "intermediate language"} and
 | |
| 69 |   @{text "target language"} by means of @{text translation} and
 | |
| 70 |   @{text serialisation};  for the latter, the serialiser has to observe
 | |
| 71 |   the structure of the @{text language} itself plus some @{text reserved}
 | |
| 72 | keywords which have to be avoided for generated code. | |
| 31050 | 73 |   However, if you consider @{text adaptation} mechanisms, the code generated
 | 
| 28601 | 74 | by the serializer is just the tip of the iceberg: | 
| 75 | ||
| 76 |   \begin{itemize}
 | |
| 28635 | 77 |     \item @{text serialisation} can be \emph{parametrised} such that
 | 
| 78 | logical entities are mapped to target-specific ones | |
| 79 | (e.g. target-specific list syntax, | |
| 31050 | 80 |         see also \secref{sec:adaptation_mechanisms})
 | 
| 28635 | 81 | \item Such parametrisations can involve references to a | 
| 82 |       target-specific standard @{text library} (e.g. using
 | |
| 83 |       the @{text Haskell} @{verbatim Maybe} type instead
 | |
| 84 |       of the @{text HOL} @{type "option"} type);
 | |
| 85 | if such are used, the corresponding identifiers | |
| 86 |       (in our example, @{verbatim Maybe}, @{verbatim Nothing}
 | |
| 87 |       and @{verbatim Just}) also have to be considered @{text reserved}.
 | |
| 88 | \item Even more, the user can enrich the library of the | |
| 89 | target-language by providing code snippets | |
| 90 |       (\qt{@{text "includes"}}) which are prepended to
 | |
| 91 |       any generated code (see \secref{sec:include});  this typically
 | |
| 92 |       also involves further @{text reserved} identifiers.
 | |
| 28601 | 93 |   \end{itemize}
 | 
| 28635 | 94 | |
| 31050 | 95 |   \noindent As figure \ref{fig:adaptation} illustrates, all these adaptation mechanisms
 | 
| 28635 | 96 | have to act consistently; it is at the discretion of the user | 
| 97 | to take care for this. | |
| 28561 | 98 | *} | 
| 99 | ||
| 31050 | 100 | subsection {* Common adaptation patterns *}
 | 
| 28419 | 101 | |
| 102 | text {*
 | |
| 28428 | 103 |   The @{theory HOL} @{theory Main} theory already provides a code
 | 
| 28419 | 104 | generator setup | 
| 28593 | 105 | which should be suitable for most applications. Common extensions | 
| 28419 | 106 |   and modifications are available by certain theories of the @{text HOL}
 | 
| 107 | library; beside being useful in applications, they may serve | |
| 108 | as a tutorial for customising the code generator setup (see below | |
| 31050 | 109 |   \secref{sec:adaptation_mechanisms}).
 | 
| 28419 | 110 | |
| 111 |   \begin{description}
 | |
| 112 | ||
| 113 |     \item[@{theory "Code_Integer"}] represents @{text HOL} integers by big
 | |
| 114 | integer literals in target languages. | |
| 115 |     \item[@{theory "Code_Char"}] represents @{text HOL} characters by 
 | |
| 116 | character literals in target languages. | |
| 117 |     \item[@{theory "Code_Char_chr"}] like @{text "Code_Char"},
 | |
| 118 | but also offers treatment of character codes; includes | |
| 28561 | 119 |        @{theory "Code_Char"}.
 | 
| 28419 | 120 |     \item[@{theory "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers,
 | 
| 121 | which in general will result in higher efficiency; pattern | |
| 122 |        matching with @{term "0\<Colon>nat"} / @{const "Suc"}
 | |
| 28561 | 123 |        is eliminated;  includes @{theory "Code_Integer"}
 | 
| 31206 | 124 |        and @{theory "Code_Numeral"}.
 | 
| 125 |     \item[@{theory "Code_Numeral"}] provides an additional datatype
 | |
| 28419 | 126 |        @{typ index} which is mapped to target-language built-in integers.
 | 
| 127 | Useful for code setups which involve e.g. indexing of | |
| 128 | target-language arrays. | |
| 31050 | 129 |     \item[@{theory "String"}] provides an additional datatype
 | 
| 31205 
98370b26c2ce
String.literal replaces message_string, code_numeral replaces (code_)index
 haftmann parents: 
31050diff
changeset | 130 |        @{typ String.literal} which is isomorphic to strings;
 | 
| 
98370b26c2ce
String.literal replaces message_string, code_numeral replaces (code_)index
 haftmann parents: 
31050diff
changeset | 131 |        @{typ String.literal}s are mapped to target-language strings.
 | 
| 28419 | 132 | Useful for code setups which involve e.g. printing (error) messages. | 
| 133 | ||
| 134 |   \end{description}
 | |
| 135 | ||
| 136 |   \begin{warn}
 | |
| 137 | When importing any of these theories, they should form the last | |
| 138 | items in an import list. Since these theories adapt the | |
| 139 | code generator setup in a non-conservative fashion, | |
| 140 | strange effects may occur otherwise. | |
| 141 |   \end{warn}
 | |
| 142 | *} | |
| 143 | ||
| 144 | ||
| 31050 | 145 | subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *}
 | 
| 28419 | 146 | |
| 147 | text {*
 | |
| 28561 | 148 | Consider the following function and its corresponding | 
| 28419 | 149 | SML code: | 
| 150 | *} | |
| 151 | ||
| 28564 | 152 | primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where | 
| 28419 | 153 | "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l" | 
| 28447 | 154 | (*<*) | 
| 28419 | 155 | code_type %invisible bool | 
| 156 | (SML) | |
| 157 | code_const %invisible True and False and "op \<and>" and Not | |
| 158 | (SML and and and) | |
| 28447 | 159 | (*>*) | 
| 28564 | 160 | text %quote {*@{code_stmts in_interval (SML)}*}
 | 
| 28419 | 161 | |
| 162 | text {*
 | |
| 163 | \noindent Though this is correct code, it is a little bit unsatisfactory: | |
| 164 | boolean values and operators are materialised as distinguished | |
| 165 | entities with have nothing to do with the SML-built-in notion | |
| 166 |   of \qt{bool}.  This results in less readable code;
 | |
| 167 | additionally, eager evaluation may cause programs to | |
| 168 | loop or break which would perfectly terminate when | |
| 169 |   the existing SML @{verbatim "bool"} would be used.  To map
 | |
| 170 |   the HOL @{typ bool} on SML @{verbatim "bool"}, we may use
 | |
| 171 |   \qn{custom serialisations}:
 | |
| 172 | *} | |
| 173 | ||
| 28564 | 174 | code_type %quotett bool | 
| 28419 | 175 | (SML "bool") | 
| 28564 | 176 | code_const %quotett True and False and "op \<and>" | 
| 28419 | 177 | (SML "true" and "false" and "_ andalso _") | 
| 28213 | 178 | |
| 28419 | 179 | text {*
 | 
| 28447 | 180 |   \noindent The @{command code_type} command takes a type constructor
 | 
| 28419 | 181 | as arguments together with a list of custom serialisations. | 
| 182 | Each custom serialisation starts with a target language | |
| 183 | identifier followed by an expression, which during | |
| 184 | code serialisation is inserted whenever the type constructor | |
| 185 |   would occur.  For constants, @{command code_const} implements
 | |
| 186 |   the corresponding mechanism.  Each ``@{verbatim "_"}'' in
 | |
| 187 | a serialisation expression is treated as a placeholder | |
| 188 | for the type constructor's (the constant's) arguments. | |
| 189 | *} | |
| 190 | ||
| 28564 | 191 | text %quote {*@{code_stmts in_interval (SML)}*}
 | 
| 28419 | 192 | |
| 193 | text {*
 | |
| 194 | \noindent This still is not perfect: the parentheses | |
| 195 |   around the \qt{andalso} expression are superfluous.
 | |
| 28593 | 196 | Though the serialiser | 
| 28419 | 197 | by no means attempts to imitate the rich Isabelle syntax | 
| 198 | framework, it provides some common idioms, notably | |
| 199 | associative infixes with precedences which may be used here: | |
| 200 | *} | |
| 201 | ||
| 28564 | 202 | code_const %quotett "op \<and>" | 
| 28419 | 203 | (SML infixl 1 "andalso") | 
| 204 | ||
| 28564 | 205 | text %quote {*@{code_stmts in_interval (SML)}*}
 | 
| 28419 | 206 | |
| 207 | text {*
 | |
| 28561 | 208 | \noindent The attentive reader may ask how we assert that no generated | 
| 209 | code will accidentally overwrite. For this reason the serialiser has | |
| 210 | an internal table of identifiers which have to be avoided to be used | |
| 211 | for new declarations. Initially, this table typically contains the | |
| 212 | keywords of the target language. It can be extended manually, thus avoiding | |
| 213 |   accidental overwrites, using the @{command "code_reserved"} command:
 | |
| 214 | *} | |
| 215 | ||
| 28601 | 216 | code_reserved %quote "\<SML>" bool true false andalso | 
| 28561 | 217 | |
| 218 | text {*
 | |
| 28447 | 219 | \noindent Next, we try to map HOL pairs to SML pairs, using the | 
| 28419 | 220 |   infix ``@{verbatim "*"}'' type constructor and parentheses:
 | 
| 221 | *} | |
| 28447 | 222 | (*<*) | 
| 37836 | 223 | code_type %invisible prod | 
| 28419 | 224 | (SML) | 
| 225 | code_const %invisible Pair | |
| 226 | (SML) | |
| 28447 | 227 | (*>*) | 
| 37836 | 228 | code_type %quotett prod | 
| 28419 | 229 | (SML infix 2 "*") | 
| 28564 | 230 | code_const %quotett Pair | 
| 28419 | 231 | (SML "!((_),/ (_))") | 
| 232 | ||
| 233 | text {*
 | |
| 28593 | 234 |   \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
 | 
| 28561 | 235 | never to put | 
| 28419 | 236 | parentheses around the whole expression (they are already present), | 
| 237 | while the parentheses around argument place holders | |
| 238 | tell not to put parentheses around the arguments. | |
| 239 |   The slash ``@{verbatim "/"}'' (followed by arbitrary white space)
 | |
| 240 | inserts a space which may be used as a break if necessary | |
| 241 | during pretty printing. | |
| 242 | ||
| 243 | These examples give a glimpse what mechanisms | |
| 244 | custom serialisations provide; however their usage | |
| 245 | requires careful thinking in order not to introduce | |
| 246 | inconsistencies -- or, in other words: | |
| 247 | custom serialisations are completely axiomatic. | |
| 248 | ||
| 249 | A further noteworthy details is that any special | |
| 250 | character in a custom serialisation may be quoted | |
| 251 |   using ``@{verbatim "'"}''; thus, in
 | |
| 252 |   ``@{verbatim "fn '_ => _"}'' the first
 | |
| 253 |   ``@{verbatim "_"}'' is a proper underscore while the
 | |
| 254 |   second ``@{verbatim "_"}'' is a placeholder.
 | |
| 255 | *} | |
| 256 | ||
| 257 | ||
| 258 | subsection {* @{text Haskell} serialisation *}
 | |
| 259 | ||
| 260 | text {*
 | |
| 261 | For convenience, the default | |
| 262 |   @{text HOL} setup for @{text Haskell} maps the @{class eq} class to
 | |
| 263 |   its counterpart in @{text Haskell}, giving custom serialisations
 | |
| 264 |   for the class @{class eq} (by command @{command code_class}) and its operation
 | |
| 265 |   @{const HOL.eq}
 | |
| 266 | *} | |
| 267 | ||
| 28564 | 268 | code_class %quotett eq | 
| 28714 | 269 | (Haskell "Eq") | 
| 28419 | 270 | |
| 28564 | 271 | code_const %quotett "op =" | 
| 28419 | 272 | (Haskell infixl 4 "==") | 
| 273 | ||
| 274 | text {*
 | |
| 28447 | 275 | \noindent A problem now occurs whenever a type which | 
| 28419 | 276 |   is an instance of @{class eq} in @{text HOL} is mapped
 | 
| 277 |   on a @{text Haskell}-built-in type which is also an instance
 | |
| 278 |   of @{text Haskell} @{text Eq}:
 | |
| 279 | *} | |
| 280 | ||
| 28564 | 281 | typedecl %quote bar | 
| 28419 | 282 | |
| 28564 | 283 | instantiation %quote bar :: eq | 
| 28419 | 284 | begin | 
| 285 | ||
| 28564 | 286 | definition %quote "eq_class.eq (x\<Colon>bar) y \<longleftrightarrow> x = y" | 
| 28419 | 287 | |
| 28564 | 288 | instance %quote by default (simp add: eq_bar_def) | 
| 28213 | 289 | |
| 30880 | 290 | end %quote (*<*) | 
| 291 | ||
| 292 | (*>*) code_type %quotett bar | |
| 28419 | 293 | (Haskell "Integer") | 
| 294 | ||
| 295 | text {*
 | |
| 28447 | 296 | \noindent The code generator would produce | 
| 28593 | 297 |   an additional instance, which of course is rejected by the @{text Haskell}
 | 
| 28419 | 298 | compiler. | 
| 299 | To suppress this additional instance, use | |
| 300 |   @{text "code_instance"}:
 | |
| 301 | *} | |
| 302 | ||
| 28564 | 303 | code_instance %quotett bar :: eq | 
| 28419 | 304 | (Haskell -) | 
| 305 | ||
| 28561 | 306 | |
| 28635 | 307 | subsection {* Enhancing the target language context \label{sec:include} *}
 | 
| 28561 | 308 | |
| 309 | text {*
 | |
| 28593 | 310 |   In rare cases it is necessary to \emph{enrich} the context of a
 | 
| 28561 | 311 |   target language;  this is accomplished using the @{command "code_include"}
 | 
| 312 | command: | |
| 313 | *} | |
| 314 | ||
| 28564 | 315 | code_include %quotett Haskell "Errno" | 
| 28561 | 316 | {*errno i = error ("Error number: " ++ show i)*}
 | 
| 317 | ||
| 28564 | 318 | code_reserved %quotett Haskell Errno | 
| 28561 | 319 | |
| 320 | text {*
 | |
| 321 |   \noindent Such named @{text include}s are then prepended to every generated code.
 | |
| 322 |   Inspect such code in order to find out how @{command "code_include"} behaves
 | |
| 323 | with respect to a particular target language. | |
| 324 | *} | |
| 325 | ||
| 28419 | 326 | end |