doc-src/Codegen/Thy/Adaptation.thy
author haftmann
Thu Sep 23 15:46:17 2010 +0200 (2010-09-23)
changeset 39664 0afaf89ab591
parent 39643 29cc021398fc
child 39683 f75a01ee6c41
permissions -rw-r--r--
more canonical type setting of type writer code examples
haftmann@31050
     1
theory Adaptation
haftmann@28213
     2
imports Setup
haftmann@28213
     3
begin
haftmann@28213
     4
haftmann@28679
     5
setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) *}
haftmann@28561
     6
haftmann@31050
     7
section {* Adaptation to target languages \label{sec:adaptation} *}
haftmann@28419
     8
haftmann@28561
     9
subsection {* Adapting code generation *}
haftmann@28561
    10
haftmann@28561
    11
text {*
haftmann@28561
    12
  The features of code generation introduced so far have two aspects
haftmann@28561
    13
  in common:
haftmann@28561
    14
haftmann@28561
    15
  \begin{itemize}
haftmann@38450
    16
haftmann@38450
    17
    \item They act uniformly, without reference to a specific target
haftmann@38450
    18
       language.
haftmann@38450
    19
haftmann@28561
    20
    \item They are \emph{safe} in the sense that as long as you trust
haftmann@28561
    21
       the code generator meta theory and implementation, you cannot
haftmann@38450
    22
       produce programs that yield results which are not derivable in
haftmann@38450
    23
       the logic.
haftmann@38450
    24
haftmann@28561
    25
  \end{itemize}
haftmann@28561
    26
haftmann@38450
    27
  \noindent In this section we will introduce means to \emph{adapt}
haftmann@38450
    28
  the serialiser to a specific target language, i.e.~to print program
haftmann@38450
    29
  fragments in a way which accommodates \qt{already existing}
haftmann@38450
    30
  ingredients of a target language environment, for three reasons:
haftmann@28561
    31
haftmann@28561
    32
  \begin{itemize}
haftmann@28593
    33
    \item improving readability and aesthetics of generated code
haftmann@28561
    34
    \item gaining efficiency
haftmann@28561
    35
    \item interfacing with language parts which have no direct counterpart
haftmann@28561
    36
      in @{text "HOL"} (say, imperative data structures)
haftmann@28561
    37
  \end{itemize}
haftmann@28561
    38
haftmann@28561
    39
  \noindent Generally, you should avoid using those features yourself
haftmann@28561
    40
  \emph{at any cost}:
haftmann@28561
    41
haftmann@28561
    42
  \begin{itemize}
haftmann@38450
    43
haftmann@38450
    44
    \item The safe configuration methods act uniformly on every target
haftmann@38450
    45
      language, whereas for adaptation you have to treat each target
haftmann@38450
    46
      language separately.
haftmann@38450
    47
haftmann@38450
    48
    \item Application is extremely tedious since there is no
haftmann@38450
    49
      abstraction which would allow for a static check, making it easy
haftmann@38450
    50
      to produce garbage.
haftmann@38450
    51
paulson@34155
    52
    \item Subtle errors can be introduced unconsciously.
haftmann@38450
    53
haftmann@28561
    54
  \end{itemize}
haftmann@28561
    55
haftmann@38450
    56
  \noindent However, even if you ought to refrain from setting up
haftmann@38450
    57
  adaptation yourself, @{text "HOL"} already comes with some
haftmann@38450
    58
  reasonable default adaptations (say, using target language list
haftmann@38450
    59
  syntax).  There are also some common adaptation cases which you can
haftmann@38450
    60
  set up by importing particular library theories.  In order to
haftmann@38450
    61
  understand these, we provide some clues here; these however are not
haftmann@38450
    62
  supposed to replace a careful study of the sources.
haftmann@28561
    63
*}
haftmann@28561
    64
haftmann@38450
    65
haftmann@31050
    66
subsection {* The adaptation principle *}
haftmann@28561
    67
haftmann@28561
    68
text {*
haftmann@38450
    69
  Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is
haftmann@38450
    70
  conceptually supposed to be:
haftmann@28601
    71
haftmann@28601
    72
  \begin{figure}[here]
haftmann@31050
    73
    \includegraphics{adaptation}
haftmann@31050
    74
    \caption{The adaptation principle}
haftmann@31050
    75
    \label{fig:adaptation}
haftmann@28601
    76
  \end{figure}
haftmann@28601
    77
haftmann@28601
    78
  \noindent In the tame view, code generation acts as a broker between
haftmann@38450
    79
  @{text logic}, @{text "intermediate language"} and @{text "target
haftmann@38450
    80
  language"} by means of @{text translation} and @{text
haftmann@38450
    81
  serialisation}; for the latter, the serialiser has to observe the
haftmann@38450
    82
  structure of the @{text language} itself plus some @{text reserved}
haftmann@38450
    83
  keywords which have to be avoided for generated code.  However, if
haftmann@38450
    84
  you consider @{text adaptation} mechanisms, the code generated by
haftmann@38450
    85
  the serialiser is just the tip of the iceberg:
haftmann@28601
    86
haftmann@28601
    87
  \begin{itemize}
haftmann@38450
    88
haftmann@28635
    89
    \item @{text serialisation} can be \emph{parametrised} such that
haftmann@28635
    90
      logical entities are mapped to target-specific ones
haftmann@38450
    91
      (e.g. target-specific list syntax, see also
haftmann@38450
    92
      \secref{sec:adaptation_mechanisms})
haftmann@38450
    93
haftmann@28635
    94
    \item Such parametrisations can involve references to a
haftmann@38450
    95
      target-specific standard @{text library} (e.g. using the @{text
haftmann@38450
    96
      Haskell} @{verbatim Maybe} type instead of the @{text HOL}
haftmann@38450
    97
      @{type "option"} type); if such are used, the corresponding
haftmann@38450
    98
      identifiers (in our example, @{verbatim Maybe}, @{verbatim
haftmann@38450
    99
      Nothing} and @{verbatim Just}) also have to be considered @{text
haftmann@38450
   100
      reserved}.
haftmann@38450
   101
haftmann@28635
   102
    \item Even more, the user can enrich the library of the
haftmann@38450
   103
      target-language by providing code snippets (\qt{@{text
haftmann@38450
   104
      "includes"}}) which are prepended to any generated code (see
haftmann@38450
   105
      \secref{sec:include}); this typically also involves further
haftmann@38450
   106
      @{text reserved} identifiers.
haftmann@38450
   107
haftmann@28601
   108
  \end{itemize}
haftmann@28635
   109
haftmann@38450
   110
  \noindent As figure \ref{fig:adaptation} illustrates, all these
haftmann@38450
   111
  adaptation mechanisms have to act consistently; it is at the
haftmann@38450
   112
  discretion of the user to take care of this.
haftmann@28561
   113
*}
haftmann@28561
   114
haftmann@31050
   115
subsection {* Common adaptation patterns *}
haftmann@28419
   116
haftmann@28419
   117
text {*
haftmann@28428
   118
  The @{theory HOL} @{theory Main} theory already provides a code
haftmann@38450
   119
  generator setup which should be suitable for most applications.
haftmann@38450
   120
  Common extensions and modifications are available through certain
haftmann@38450
   121
  theories of the @{text HOL} library; besides being useful in
haftmann@38450
   122
  applications, they may serve as a tutorial for customising the code
haftmann@38450
   123
  generator setup (see below \secref{sec:adaptation_mechanisms}).
haftmann@28419
   124
haftmann@28419
   125
  \begin{description}
haftmann@28419
   126
haftmann@38450
   127
    \item[@{theory "Code_Integer"}] represents @{text HOL} integers by
haftmann@38450
   128
       big integer literals in target languages.
haftmann@38450
   129
haftmann@38450
   130
    \item[@{theory "Code_Char"}] represents @{text HOL} characters by
haftmann@28419
   131
       character literals in target languages.
haftmann@38450
   132
haftmann@38450
   133
    \item[@{theory "Code_Char_chr"}] like @{text "Code_Char"}, but
haftmann@38450
   134
       also offers treatment of character codes; includes @{theory
haftmann@38450
   135
       "Code_Char"}.
haftmann@38450
   136
haftmann@38450
   137
    \item[@{theory "Efficient_Nat"}] \label{eff_nat} implements
haftmann@38450
   138
       natural numbers by integers, which in general will result in
haftmann@38450
   139
       higher efficiency; pattern matching with @{term "0\<Colon>nat"} /
haftmann@38450
   140
       @{const "Suc"} is eliminated; includes @{theory "Code_Integer"}
haftmann@31206
   141
       and @{theory "Code_Numeral"}.
haftmann@38450
   142
haftmann@31206
   143
    \item[@{theory "Code_Numeral"}] provides an additional datatype
haftmann@38450
   144
       @{typ index} which is mapped to target-language built-in
haftmann@38450
   145
       integers.  Useful for code setups which involve e.g.~indexing
haftmann@38450
   146
       of target-language arrays.
haftmann@38450
   147
haftmann@38450
   148
    \item[@{theory "String"}] provides an additional datatype @{typ
haftmann@38450
   149
       String.literal} which is isomorphic to strings; @{typ
haftmann@38450
   150
       String.literal}s are mapped to target-language strings.  Useful
haftmann@38450
   151
       for code setups which involve e.g.~printing (error) messages.
haftmann@28419
   152
haftmann@28419
   153
  \end{description}
haftmann@28419
   154
haftmann@28419
   155
  \begin{warn}
haftmann@28419
   156
    When importing any of these theories, they should form the last
haftmann@38450
   157
    items in an import list.  Since these theories adapt the code
haftmann@38450
   158
    generator setup in a non-conservative fashion, strange effects may
haftmann@38450
   159
    occur otherwise.
haftmann@28419
   160
  \end{warn}
haftmann@28419
   161
*}
haftmann@28419
   162
haftmann@28419
   163
haftmann@31050
   164
subsection {* Parametrising serialisation \label{sec:adaptation_mechanisms} *}
haftmann@28419
   165
haftmann@28419
   166
text {*
haftmann@38450
   167
  Consider the following function and its corresponding SML code:
haftmann@28419
   168
*}
haftmann@28419
   169
haftmann@28564
   170
primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where
haftmann@28419
   171
  "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"
haftmann@28447
   172
(*<*)
haftmann@28419
   173
code_type %invisible bool
haftmann@28419
   174
  (SML)
haftmann@28419
   175
code_const %invisible True and False and "op \<and>" and Not
haftmann@28419
   176
  (SML and and and)
haftmann@28447
   177
(*>*)
haftmann@39664
   178
text %quote {*
haftmann@39664
   179
  \begin{typewriter}
haftmann@39664
   180
    @{code_stmts in_interval (SML)}
haftmann@39664
   181
  \end{typewriter}
haftmann@39664
   182
*}
haftmann@28419
   183
haftmann@28419
   184
text {*
haftmann@38450
   185
  \noindent Though this is correct code, it is a little bit
haftmann@38450
   186
  unsatisfactory: boolean values and operators are materialised as
haftmann@38450
   187
  distinguished entities which have nothing to do with the SML-built-in
haftmann@38450
   188
  notion of \qt{bool}.  This results in less readable code;
haftmann@38450
   189
  additionally, eager evaluation may cause programs to loop or break
haftmann@38450
   190
  which would terminate perfectly well if the existing SML @{verbatim
haftmann@38450
   191
  "bool"} were used.  To map the HOL @{typ bool} to SML @{verbatim
haftmann@38450
   192
  "bool"}, we may use \qn{custom serialisations}:
haftmann@28419
   193
*}
haftmann@28419
   194
haftmann@28564
   195
code_type %quotett bool
haftmann@28419
   196
  (SML "bool")
haftmann@28564
   197
code_const %quotett True and False and "op \<and>"
haftmann@28419
   198
  (SML "true" and "false" and "_ andalso _")
haftmann@28213
   199
haftmann@28419
   200
text {*
haftmann@38505
   201
  \noindent The @{command_def code_type} command takes a type constructor
haftmann@38450
   202
  as argument together with a list of custom serialisations.  Each
haftmann@38450
   203
  custom serialisation starts with a target language identifier
haftmann@38450
   204
  followed by an expression, which during code serialisation is
haftmann@38450
   205
  inserted whenever the type constructor would occur.  For constants,
haftmann@38505
   206
  @{command_def code_const} implements the corresponding mechanism.  Each
haftmann@38450
   207
  ``@{verbatim "_"}'' in a serialisation expression is treated as a
haftmann@38450
   208
  placeholder for the type constructor's (the constant's) arguments.
haftmann@28419
   209
*}
haftmann@28419
   210
haftmann@39664
   211
text %quote {*
haftmann@39664
   212
  \begin{typewriter}
haftmann@39664
   213
    @{code_stmts in_interval (SML)}
haftmann@39664
   214
  \end{typewriter}
haftmann@39664
   215
*}
haftmann@28419
   216
haftmann@28419
   217
text {*
haftmann@38450
   218
  \noindent This still is not perfect: the parentheses around the
haftmann@38450
   219
  \qt{andalso} expression are superfluous.  Though the serialiser by
haftmann@38450
   220
  no means attempts to imitate the rich Isabelle syntax framework, it
haftmann@38450
   221
  provides some common idioms, notably associative infixes with
haftmann@38450
   222
  precedences which may be used here:
haftmann@28419
   223
*}
haftmann@28419
   224
haftmann@28564
   225
code_const %quotett "op \<and>"
haftmann@28419
   226
  (SML infixl 1 "andalso")
haftmann@28419
   227
haftmann@39664
   228
text %quote {*
haftmann@39664
   229
  \begin{typewriter}
haftmann@39664
   230
    @{code_stmts in_interval (SML)}
haftmann@39664
   231
  \end{typewriter}
haftmann@39664
   232
*}
haftmann@28419
   233
haftmann@28419
   234
text {*
haftmann@38450
   235
  \noindent The attentive reader may ask how we ensure that no
haftmann@38450
   236
  generated code will accidentally overwrite existing identifiers.  For this reason the
haftmann@38450
   237
  serialiser has an internal table of identifiers which must not be
haftmann@38450
   238
  used for new declarations.  Initially, this table
haftmann@38450
   239
  typically contains the keywords of the target language.  It can be
haftmann@38450
   240
  extended manually, thus avoiding accidental overwrites, using the
haftmann@38505
   241
  @{command_def "code_reserved"} command:
haftmann@28561
   242
*}
haftmann@28561
   243
haftmann@28601
   244
code_reserved %quote "\<SML>" bool true false andalso
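
text {*
  \noindent The same idioms carry over to the remaining boolean
  connectives.  As a minimal sketch, not part of the setup shown so
  far, disjunction and negation could be mapped to the SML primitives
  @{verbatim "orelse"} and @{verbatim "not"} and then reserved.  The
  precedence @{text 0} is an assumption, chosen so that @{verbatim
  "orelse"} binds more weakly than @{verbatim "andalso"}, as it does
  in SML itself:
*}

code_const %quotett "op \<or>" and Not
  (SML infixl 0 "orelse" and "not")

code_reserved %quote "\<SML>" orelse not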
haftmann@28561
   245
haftmann@28561
   246
text {*
haftmann@28447
   247
  \noindent Next, we try to map HOL pairs to SML pairs, using the
haftmann@28419
   248
  infix ``@{verbatim "*"}'' type constructor and parentheses:
haftmann@28419
   249
*}
haftmann@28447
   250
(*<*)
haftmann@37836
   251
code_type %invisible prod
haftmann@28419
   252
  (SML)
haftmann@28419
   253
code_const %invisible Pair
haftmann@28419
   254
  (SML)
haftmann@28447
   255
(*>*)
haftmann@37836
   256
code_type %quotett prod
haftmann@28419
   257
  (SML infix 2 "*")
haftmann@28564
   258
code_const %quotett Pair
haftmann@28419
   259
  (SML "!((_),/ (_))")
haftmann@28419
   260
haftmann@28419
   261
text {*
haftmann@28593
   262
  \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
haftmann@38450
   263
  never to put parentheses around the whole expression (they are
haftmann@38450
   264
  already present), while the parentheses around the argument
haftmann@38450
   265
  placeholders tell it not to put parentheses around the arguments.  The slash
haftmann@38450
   266
  ``@{verbatim "/"}'' (followed by arbitrary white space) inserts a
haftmann@38450
   267
  space which may be used as a break if necessary during pretty
haftmann@38450
   268
  printing.
haftmann@28419
   269
haftmann@38450
   270
  These examples give a glimpse of the mechanisms custom serialisations
haftmann@38450
   271
  provide; however their usage requires careful thinking in order not
haftmann@38450
   272
  to introduce inconsistencies -- or, in other words: custom
haftmann@38450
   273
  serialisations are completely axiomatic.
haftmann@28419
   274
haftmann@39643
   275
  A further noteworthy detail is that any special character in a
haftmann@38450
   276
  custom serialisation may be quoted using ``@{verbatim "'"}''; thus,
haftmann@38450
   277
  in ``@{verbatim "fn '_ => _"}'' the first ``@{verbatim "_"}'' is a
haftmann@38450
   278
  proper underscore while the second ``@{verbatim "_"}'' is a
haftmann@38450
   279
  placeholder.
haftmann@28419
   280
*}
haftmann@28419
   281
haftmann@28419
   282
haftmann@28419
   283
subsection {* @{text Haskell} serialisation *}
haftmann@28419
   284
haftmann@28419
   285
text {*
haftmann@38450
   286
  For convenience, the default @{text HOL} setup for @{text Haskell}
haftmann@39063
   287
  maps the @{class equal} class to its counterpart in @{text Haskell},
haftmann@39063
   288
  giving custom serialisations for the class @{class equal} (by command
haftmann@39643
   289
  @{command_def code_class}) and its operation @{const [source] HOL.equal}:
haftmann@28419
   290
*}
haftmann@28419
   291
haftmann@39063
   292
code_class %quotett equal
haftmann@28714
   293
  (Haskell "Eq")
haftmann@28419
   294
haftmann@39643
   295
code_const %quotett "HOL.equal"
haftmann@28419
   296
  (Haskell infixl 4 "==")
haftmann@28419
   297
haftmann@28419
   298
text {*
haftmann@38450
   299
  \noindent A problem now occurs whenever a type which is an instance
haftmann@39063
   300
  of @{class equal} in @{text HOL} is mapped to a @{text
haftmann@38450
   301
  Haskell}-built-in type which is also an instance of @{text Haskell}
haftmann@38450
   302
  @{text Eq}:
haftmann@28419
   303
*}
haftmann@28419
   304
haftmann@28564
   305
typedecl %quote bar
haftmann@28419
   306
haftmann@39063
   307
instantiation %quote bar :: equal
haftmann@28419
   308
begin
haftmann@28419
   309
haftmann@39063
   310
definition %quote "HOL.equal (x\<Colon>bar) y \<longleftrightarrow> x = y"
haftmann@28419
   311
haftmann@39063
   312
instance %quote by default (simp add: equal_bar_def)
haftmann@28213
   313
haftmann@30880
   314
end %quote (*<*)
haftmann@30880
   315
haftmann@30880
   316
(*>*) code_type %quotett bar
haftmann@28419
   317
  (Haskell "Integer")
haftmann@28419
   318
haftmann@28419
   319
text {*
haftmann@38450
   320
  \noindent The code generator would produce an additional instance,
haftmann@38450
   321
  which of course is rejected by the @{text Haskell} compiler.  To
haftmann@38506
   322
  suppress this additional instance, use @{command_def "code_instance"}:
haftmann@28419
   323
*}
haftmann@28419
   324
haftmann@39063
   325
code_instance %quotett bar :: equal
haftmann@28419
   326
  (Haskell -)
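
text {*
  \noindent The library mapping mentioned in the discussion of the
  adaptation principle (the @{text HOL} @{type "option"} type to the
  @{text Haskell} @{verbatim Maybe} type) can be sketched with the
  same commands.  This is an illustrative sketch only: it assumes
  that @{verbatim "Maybe _"} is accepted as serialisation expression
  for a unary type constructor, and the default @{text HOL} setup may
  already contain an equivalent declaration.  Note that the
  identifiers used are also declared as reserved:
*}

code_type %quotett option
  (Haskell "Maybe _")

code_const %quotett None and Some
  (Haskell "Nothing" and "Just")

code_reserved %quotett Haskell Maybe Nothing Just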
haftmann@28419
   327
haftmann@28561
   328
haftmann@28635
   329
subsection {* Enhancing the target language context \label{sec:include} *}
haftmann@28561
   330
haftmann@28561
   331
text {*
haftmann@28593
   332
  In rare cases it is necessary to \emph{enrich} the context of a
haftmann@38505
   333
  target language; this is accomplished using the @{command_def
haftmann@38450
   334
  "code_include"} command:
haftmann@28561
   335
*}
haftmann@28561
   336
haftmann@28564
   337
code_include %quotett Haskell "Errno"
haftmann@28561
   338
{*errno i = error ("Error number: " ++ show i)*}
haftmann@28561
   339
haftmann@28564
   340
code_reserved %quotett Haskell Errno
haftmann@28561
   341
haftmann@28561
   342
text {*
haftmann@38450
   343
  \noindent Such named @{text include}s are then prepended to all
haftmann@38450
   344
  generated code.  Inspect such code in order to find out how
haftmann@38450
   345
  @{command "code_include"} behaves with respect to a particular
haftmann@38450
   346
  target language.
haftmann@28561
   347
*}
haftmann@28561
   348
haftmann@28419
   349
end