doc-src/IsarAdvanced/Codegen/Thy/Codegen.thy
author haftmann
Tue Jul 15 16:02:07 2008 +0200 (2008-07-15)
changeset 27609 b23c9ad0fe7d
parent 27557 151731493264
child 27705 f6277c8ab8ef
permissions -rw-r--r--
tuned code theorem bookkeeping
haftmann@26513
     1
haftmann@20948
     2
(* $Id$ *)
haftmann@20948
     3
haftmann@21147
     4
(*<*)
haftmann@20948
     5
theory Codegen
haftmann@20948
     6
imports Main
haftmann@21452
     7
uses "../../../antiquote_setup.ML"
haftmann@20948
     8
begin
haftmann@20948
     9
haftmann@27103
    10
ML {* CodeTarget.code_width := 74 *}
haftmann@21147
    11
haftmann@21147
    12
(*>*)
haftmann@21147
    13
haftmann@20948
    14
chapter {* Code generation from Isabelle theories *}
haftmann@20948
    15
haftmann@20948
    16
section {* Introduction *}
haftmann@20948
    17
haftmann@21058
    18
subsection {* Motivation *}
haftmann@21058
    19
haftmann@20948
    20
text {*
haftmann@21058
    21
  Executing formal specifications as programs is a well-established
haftmann@21058
    22
  topic in the theorem proving community.  With increasing
haftmann@21058
    23
  application of theorem proving systems in the area of
haftmann@21058
    24
  software development and verification, its relevance manifests
haftmann@21058
    25
  for running test cases and rapid prototyping.  In logical
haftmann@21058
    26
  calculi like constructive type theory,
haftmann@21058
    27
  a notion of executability is implicit due to the nature
haftmann@22550
    28
  of the calculus.  In contrast, specifications in Isabelle
haftmann@21058
    29
  can be highly non-executable.  In order to bridge
haftmann@21058
    30
  the gap between logic and executable specifications,
haftmann@21058
    31
  an explicit non-trivial transformation has to be applied:
haftmann@21058
    32
  code generation.
haftmann@21058
    33
haftmann@21058
    34
  This tutorial introduces a generic code generator for the
haftmann@21058
    35
  Isabelle system \cite{isa-tutorial}.
haftmann@21058
    36
  Generic in the sense that the
haftmann@21058
    37
  \qn{target language} for which code shall ultimately be
haftmann@21058
    38
  generated is not fixed but may be an arbitrary state-of-the-art
haftmann@21058
    39
  functional programming language (currently, the implementation
haftmann@22550
    40
  supports SML \cite{SML}, OCaml \cite{OCaml} and Haskell
haftmann@22550
    41
  \cite{haskell-revised-report}).
haftmann@21058
    42
  We aim to provide a
haftmann@21058
    43
  versatile environment
haftmann@21058
    44
  suitable for software development and verification,
haftmann@21058
    45
  structuring the process
haftmann@21058
    46
  of code generation into a small set of orthogonal principles
haftmann@21058
    47
  while achieving a big coverage of application areas
haftmann@21058
    48
  with maximum flexibility.
haftmann@21189
    49
haftmann@22550
    50
  Conceptually the code generator framework is part
haftmann@22550
    51
  of Isabelle's @{text Pure} meta logic; the object logic
haftmann@22550
    52
  @{text HOL} which is an extension of @{text Pure}
haftmann@22550
    53
  already comes with a reasonable framework setup and thus provides
haftmann@22550
    54
  a good working horse for raising code-generation-driven
haftmann@22550
    55
  applications.  So, we assume some familiarity and experience
haftmann@22550
    56
  with the ingredients of the @{text HOL} \emph{Main} theory
haftmann@22550
    57
  (see also \cite{isa-tutorial}).
haftmann@21058
    58
*}
haftmann@21058
    59
haftmann@21058
    60
haftmann@21058
    61
subsection {* Overview *}
haftmann@21058
    62
haftmann@21058
    63
text {*
haftmann@21058
    64
  The code generator aims to be usable with no further ado
haftmann@21058
    65
  in most cases while allowing for detailed customization.
haftmann@22550
    66
  This manifests in the structure of this tutorial:
haftmann@22550
    67
  we start with a generic example \secref{sec:example}
haftmann@22550
    68
  and introduce code generation concepts \secref{sec:concept}.
haftmann@22550
    69
  Section
haftmann@21178
    70
  \secref{sec:basics} explains how to use the framework naively,
haftmann@21058
    71
  presuming a reasonable default setup.  Then, section
haftmann@21058
    72
  \secref{sec:advanced} deals with advanced topics,
haftmann@21058
    73
  introducing further aspects of the code generator framework
haftmann@21058
    74
  in a motivation-driven manner.  Last, section \secref{sec:ml}
haftmann@21058
    75
  introduces the framework's internal programming interfaces.
haftmann@20948
    76
haftmann@21058
    77
  \begin{warn}
haftmann@21058
    78
    Ultimately, the code generator which this tutorial deals with
haftmann@21058
    79
    is supposed to replace the already established code generator
haftmann@21058
    80
    by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
haftmann@21147
    81
    So, for the moment, there are two distinct code generators
haftmann@21058
    82
    in Isabelle.
haftmann@22916
    83
    Also note that while the framework itself is
haftmann@22550
    84
    object-logic independent, only @{text HOL} provides a reasonable
haftmann@21058
    85
    framework setup.    
haftmann@21058
    86
  \end{warn}
haftmann@21058
    87
*}
haftmann@21058
    88
haftmann@21058
    89
haftmann@22550
    90
section {* An example: a simple theory of search trees \label{sec:example} *}
haftmann@22550
    91
haftmann@22550
    92
text {*
haftmann@22916
    93
  When writing executable specifications using @{text HOL},
haftmann@22916
    94
  it is convenient to use
haftmann@22550
    95
  three existing packages: the datatype package for defining
haftmann@22550
    96
  datatypes, the function package for (recursive) functions,
haftmann@22550
    97
  and the class package for overloaded definitions.
haftmann@22550
    98
haftmann@22550
    99
  We develope a small theory of search trees; trees are represented
haftmann@22550
   100
  as a datatype with key type @{typ "'a"} and value type @{typ "'b"}:
haftmann@22550
   101
*}
haftmann@22479
   102
haftmann@22479
   103
datatype ('a, 'b) searchtree = Leaf "'a\<Colon>linorder" 'b
haftmann@22479
   104
  | Branch "('a, 'b) searchtree" "'a" "('a, 'b) searchtree"
haftmann@22479
   105
haftmann@22550
   106
text {*
haftmann@22550
   107
  \noindent Note that we have constrained the type of keys
haftmann@22550
   108
  to the class of total orders, @{text linorder}.
haftmann@22550
   109
haftmann@22550
   110
  We define @{text find} and @{text update} functions:
haftmann@22550
   111
*}
haftmann@22550
   112
haftmann@25870
   113
primrec
haftmann@22479
   114
  find :: "('a\<Colon>linorder, 'b) searchtree \<Rightarrow> 'a \<Rightarrow> 'b option" where
haftmann@22479
   115
  "find (Leaf key val) it = (if it = key then Some val else None)"
haftmann@22479
   116
  | "find (Branch t1 key t2) it = (if it \<le> key then find t1 it else find t2 it)"
haftmann@22479
   117
haftmann@22479
   118
fun
haftmann@22479
   119
  update :: "'a\<Colon>linorder \<times> 'b \<Rightarrow> ('a, 'b) searchtree \<Rightarrow> ('a, 'b) searchtree" where
haftmann@22479
   120
  "update (it, entry) (Leaf key val) = (
haftmann@22479
   121
    if it = key then Leaf key entry
haftmann@22479
   122
      else if it \<le> key
haftmann@22479
   123
      then Branch (Leaf it entry) it (Leaf key val)
haftmann@22479
   124
      else Branch (Leaf key val) it (Leaf it entry)
haftmann@22479
   125
   )"
haftmann@22479
   126
  | "update (it, entry) (Branch t1 key t2) = (
haftmann@22479
   127
    if it \<le> key
haftmann@22479
   128
      then (Branch (update (it, entry) t1) key t2)
haftmann@22479
   129
      else (Branch t1 key (update (it, entry) t2))
haftmann@22479
   130
   )"
haftmann@22479
   131
haftmann@22550
   132
text {*
haftmann@22550
   133
  \noindent For testing purpose, we define a small example
haftmann@22550
   134
  using natural numbers @{typ nat} (which are a @{text linorder})
haftmann@23130
   135
  as keys and list of nats as values:
haftmann@22550
   136
*}
haftmann@22550
   137
haftmann@23130
   138
definition
haftmann@23130
   139
  example :: "(nat, nat list) searchtree"
haftmann@23130
   140
where
haftmann@23130
   141
  "example = update (Suc (Suc (Suc (Suc 0))), [Suc (Suc 0), Suc (Suc 0)]) (update (Suc (Suc (Suc 0)), [Suc (Suc (Suc 0))])
haftmann@23130
   142
    (update (Suc (Suc 0), [Suc (Suc 0)]) (Leaf (Suc 0) [])))"
haftmann@22479
   143
haftmann@22550
   144
text {*
haftmann@22550
   145
  \noindent Then we generate code
haftmann@22550
   146
*}
haftmann@22550
   147
haftmann@26999
   148
export_code example (*<*)in SML (*>*)in SML file "examples/tree.ML"
haftmann@22479
   149
haftmann@22479
   150
text {*
haftmann@22550
   151
  \noindent which looks like:
haftmann@22479
   152
  \lstsml{Thy/examples/tree.ML}
haftmann@22479
   153
*}
haftmann@22479
   154
haftmann@22550
   155
haftmann@22550
   156
section {* Code generation concepts and process \label{sec:concept} *}
haftmann@21058
   157
haftmann@21058
   158
text {*
haftmann@21189
   159
  \begin{figure}[h]
haftmann@21189
   160
  \centering
haftmann@21189
   161
  \includegraphics[width=0.7\textwidth]{codegen_process}
haftmann@21189
   162
  \caption{code generator -- processing overview}
haftmann@21189
   163
  \label{fig:process}
haftmann@21189
   164
  \end{figure}
haftmann@21189
   165
haftmann@21058
   166
  The code generator employs a notion of executability
haftmann@21058
   167
  for three foundational executable ingredients known
haftmann@21058
   168
  from functional programming:
haftmann@22060
   169
  \emph{defining equations}, \emph{datatypes}, and
haftmann@22060
   170
  \emph{type classes}. A defining equation as a first approximation
haftmann@21058
   171
  is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
haftmann@21058
   172
  (an equation headed by a constant @{text f} with arguments
haftmann@22798
   173
  @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}).
haftmann@22060
   174
  Code generation aims to turn defining equations
haftmann@21058
   175
  into a functional program by running through
haftmann@21058
   176
  a process (see figure \ref{fig:process}):
haftmann@21058
   177
haftmann@21058
   178
  \begin{itemize}
haftmann@21058
   179
haftmann@21058
   180
    \item Out of the vast collection of theorems proven in a
haftmann@21058
   181
      \qn{theory}, a reasonable subset modeling
haftmann@22060
   182
      defining equations is \qn{selected}.
haftmann@21058
   183
haftmann@21058
   184
    \item On those selected theorems, certain
haftmann@21058
   185
      transformations are carried out
haftmann@21058
   186
      (\qn{preprocessing}).  Their purpose is to turn theorems
haftmann@21058
   187
      representing non- or badly executable
haftmann@21058
   188
      specifications into equivalent but executable counterparts.
haftmann@21058
   189
      The result is a structured collection of \qn{code theorems}.
haftmann@21058
   190
haftmann@22479
   191
    \item These \qn{code theorems} then are \qn{translated}
haftmann@21058
   192
      into an Haskell-like intermediate
haftmann@21058
   193
      language.
haftmann@21058
   194
haftmann@21058
   195
    \item Finally, out of the intermediate language the final
haftmann@21058
   196
      code in the desired \qn{target language} is \qn{serialized}.
haftmann@21058
   197
haftmann@21058
   198
  \end{itemize}
haftmann@21058
   199
haftmann@21058
   200
  From these steps, only the two last are carried out
haftmann@21058
   201
  outside the logic; by keeping this layer as
haftmann@21058
   202
  thin as possible, the amount of code to trust is
haftmann@21058
   203
  kept to a minimum.
haftmann@21058
   204
*}
haftmann@21058
   205
haftmann@21058
   206
haftmann@21058
   207
haftmann@21058
   208
section {* Basics \label{sec:basics} *}
haftmann@20948
   209
haftmann@20948
   210
subsection {* Invoking the code generator *}
haftmann@20948
   211
haftmann@21058
   212
text {*
haftmann@22916
   213
  Thanks to a reasonable setup of the @{text HOL} theories, in
haftmann@21058
   214
  most cases code generation proceeds without further ado:
haftmann@21058
   215
*}
haftmann@21058
   216
haftmann@25870
   217
primrec
haftmann@22473
   218
  fac :: "nat \<Rightarrow> nat" where
haftmann@22473
   219
    "fac 0 = 1"
haftmann@22473
   220
  | "fac (Suc n) = Suc n * fac n"
haftmann@21058
   221
haftmann@21058
   222
text {*
haftmann@22550
   223
  \noindent This executable specification is now turned to SML code:
haftmann@21058
   224
*}
haftmann@21058
   225
haftmann@24348
   226
export_code fac (*<*)in SML(*>*)in SML file "examples/fac.ML"
haftmann@21058
   227
haftmann@21058
   228
text {*
haftmann@24348
   229
  \noindent  The @{text "\<EXPORTCODE>"} command takes a space-separated list of
haftmann@21058
   230
  constants together with \qn{serialization directives}
haftmann@22845
   231
  These start with a \qn{target language}
haftmann@22845
   232
  identifier, followed by a file specification
haftmann@22845
   233
  where to write the generated code to.
haftmann@21058
   234
haftmann@22060
   235
  Internally, the defining equations for all selected
haftmann@21178
   236
  constants are taken, including any transitively required
haftmann@21058
   237
  constants, datatypes and classes, resulting in the following
haftmann@21058
   238
  code:
haftmann@21058
   239
haftmann@21058
   240
  \lstsml{Thy/examples/fac.ML}
haftmann@21058
   241
haftmann@21058
   242
  The code generator will complain when a required
haftmann@22550
   243
  ingredient does not provide a executable counterpart,
haftmann@22550
   244
  e.g.~generating code
haftmann@21058
   245
  for constants not yielding
haftmann@22550
   246
  a defining equation (e.g.~the Hilbert choice
haftmann@22550
   247
  operation @{text "SOME"}):
haftmann@21058
   248
*}
haftmann@21058
   249
(*<*)
haftmann@21058
   250
setup {* Sign.add_path "foo" *}
haftmann@21058
   251
(*>*)
haftmann@21058
   252
definition
haftmann@21993
   253
  pick_some :: "'a list \<Rightarrow> 'a" where
haftmann@21058
   254
  "pick_some xs = (SOME x. x \<in> set xs)"
haftmann@21058
   255
(*<*)
haftmann@21058
   256
hide const pick_some
haftmann@21058
   257
haftmann@21058
   258
setup {* Sign.parent_path *}
haftmann@21058
   259
haftmann@21058
   260
definition
haftmann@21993
   261
  pick_some :: "'a list \<Rightarrow> 'a" where
haftmann@21058
   262
  "pick_some = hd"
haftmann@21058
   263
(*>*)
haftmann@25870
   264
haftmann@24348
   265
export_code pick_some in SML file "examples/fail_const.ML"
haftmann@21058
   266
haftmann@22550
   267
text {* \noindent will fail. *}
haftmann@22550
   268
haftmann@20948
   269
subsection {* Theorem selection *}
haftmann@20948
   270
haftmann@21058
   271
text {*
haftmann@22060
   272
  The list of all defining equations in a theory may be inspected
haftmann@22292
   273
  using the @{text "\<PRINTCODESETUP>"} command:
haftmann@21058
   274
*}
haftmann@21058
   275
haftmann@22292
   276
print_codesetup
haftmann@21058
   277
haftmann@21058
   278
text {*
haftmann@21058
   279
  \noindent which displays a table of constant with corresponding
haftmann@22060
   280
  defining equations (the additional stuff displayed
haftmann@22751
   281
  shall not bother us for the moment).
haftmann@21058
   282
haftmann@22916
   283
  The typical @{text HOL} tools are already set up in a way that
haftmann@22751
   284
  function definitions introduced by @{text "\<DEFINITION>"},
haftmann@25870
   285
  @{text "\<PRIMREC>"}, @{text "\<FUN>"},
haftmann@25870
   286
  @{text "\<FUNCTION>"}, @{text "\<CONSTDEFS>"},
haftmann@21323
   287
  @{text "\<RECDEF>"} are implicitly propagated
haftmann@22060
   288
  to this defining equation table. Specific theorems may be
haftmann@21058
   289
  selected using an attribute: \emph{code func}. As example,
haftmann@21058
   290
  a weight selector function:
haftmann@21058
   291
*}
haftmann@21058
   292
haftmann@21058
   293
primrec
haftmann@25870
   294
  pick :: "(nat \<times> 'a) list \<Rightarrow> nat \<Rightarrow> 'a" where
haftmann@21058
   295
  "pick (x#xs) n = (let (k, v) = x in
haftmann@21058
   296
    if n < k then v else pick xs (n - k))"
haftmann@21058
   297
haftmann@21058
   298
text {*
haftmann@22798
   299
  \noindent We want to eliminate the explicit destruction
haftmann@21058
   300
  of @{term x} to @{term "(k, v)"}:
haftmann@21058
   301
*}
haftmann@21058
   302
haftmann@21058
   303
lemma [code func]:
haftmann@21058
   304
  "pick ((k, v)#xs) n = (if n < k then v else pick xs (n - k))"
haftmann@21058
   305
  by simp
haftmann@21058
   306
haftmann@24348
   307
export_code pick (*<*)in SML(*>*) in SML file "examples/pick1.ML"
haftmann@21058
   308
haftmann@21058
   309
text {*
haftmann@22798
   310
  \noindent This theorem now is used for generating code:
haftmann@21058
   311
haftmann@21058
   312
  \lstsml{Thy/examples/pick1.ML}
haftmann@21058
   313
haftmann@22798
   314
  \noindent It might be convenient to remove the pointless original
haftmann@22845
   315
  equation, using the \emph{func del} attribute:
haftmann@21058
   316
*}
haftmann@21058
   317
haftmann@22845
   318
lemmas [code func del] = pick.simps 
haftmann@21058
   319
haftmann@24348
   320
export_code pick (*<*)in SML(*>*) in SML file "examples/pick2.ML"
haftmann@21058
   321
haftmann@21058
   322
text {*
haftmann@21058
   323
  \lstsml{Thy/examples/pick2.ML}
haftmann@21058
   324
haftmann@22798
   325
  \noindent Syntactic redundancies are implicitly dropped. For example,
haftmann@21058
   326
  using a modified version of the @{const fac} function
haftmann@22060
   327
  as defining equation, the then redundant (since
haftmann@22060
   328
  syntactically subsumed) original defining equations
haftmann@21058
   329
  are dropped, resulting in a warning:
haftmann@21058
   330
*}
haftmann@21058
   331
haftmann@21058
   332
lemma [code func]:
haftmann@21058
   333
  "fac n = (case n of 0 \<Rightarrow> 1 | Suc m \<Rightarrow> n * fac m)"
haftmann@21058
   334
  by (cases n) simp_all
haftmann@21058
   335
haftmann@24348
   336
export_code fac (*<*)in SML(*>*)in SML file "examples/fac_case.ML"
haftmann@21058
   337
haftmann@21058
   338
text {*
haftmann@21058
   339
  \lstsml{Thy/examples/fac_case.ML}
haftmann@21058
   340
haftmann@21058
   341
  \begin{warn}
haftmann@22292
   342
    The attributes \emph{code} and \emph{code del}
haftmann@21058
   343
    associated with the existing code generator also apply to
haftmann@21058
   344
    the new one: \emph{code} implies \emph{code func},
haftmann@22845
   345
    and \emph{code del} implies \emph{code func del}.
haftmann@21058
   346
  \end{warn}
haftmann@21058
   347
*}
haftmann@20948
   348
haftmann@20948
   349
subsection {* Type classes *}
haftmann@20948
   350
haftmann@21058
   351
text {*
haftmann@21058
   352
  Type classes enter the game via the Isar class package.
haftmann@21075
   353
  For a short introduction how to use it, see \cite{isabelle-classes};
haftmann@21147
   354
  here we just illustrate its impact on code generation.
haftmann@21058
   355
haftmann@21058
   356
  In a target language, type classes may be represented
haftmann@22798
   357
  natively (as in the case of Haskell).  For languages
haftmann@21058
   358
  like SML, they are implemented using \emph{dictionaries}.
haftmann@21178
   359
  Our following example specifies a class \qt{null},
haftmann@21058
   360
  assigning to each of its inhabitants a \qt{null} value:
haftmann@21058
   361
*}
haftmann@21058
   362
haftmann@22473
   363
class null = type +
haftmann@21058
   364
  fixes null :: 'a
haftmann@21058
   365
haftmann@25870
   366
primrec
haftmann@25870
   367
  head :: "'a\<Colon>null list \<Rightarrow> 'a" where
haftmann@21058
   368
  "head [] = null"
haftmann@22798
   369
  | "head (x#xs) = x"
haftmann@21058
   370
haftmann@21058
   371
text {*
haftmann@25533
   372
 \noindent  We provide some instances for our @{text null}:
haftmann@21058
   373
*}
haftmann@21058
   374
haftmann@25533
   375
instantiation option and list :: (type) null
haftmann@25533
   376
begin
haftmann@25533
   377
haftmann@25533
   378
definition
haftmann@25533
   379
  "null = None"
haftmann@21058
   380
haftmann@25533
   381
definition
haftmann@25533
   382
  "null = []"
haftmann@25533
   383
haftmann@25533
   384
instance ..
haftmann@25533
   385
haftmann@25533
   386
end
haftmann@21058
   387
haftmann@21058
   388
text {*
haftmann@25533
   389
  \noindent Constructing a dummy example:
haftmann@21058
   390
*}
haftmann@21058
   391
haftmann@21058
   392
definition
haftmann@21058
   393
  "dummy = head [Some (Suc 0), None]"
haftmann@21058
   394
haftmann@21058
   395
text {*
haftmann@21178
   396
  Type classes offer a suitable occasion to introduce
haftmann@21058
   397
  the Haskell serializer.  Its usage is almost the same
haftmann@21075
   398
  as SML, but, in accordance with conventions
haftmann@21075
   399
  some Haskell systems enforce, each module ends
haftmann@21075
   400
  up in a single file. The module hierarchy is reflected in
haftmann@22845
   401
  the file system, with root directory given as file specification.
haftmann@21058
   402
*}
haftmann@21058
   403
haftmann@24348
   404
export_code dummy in Haskell file "examples/"
haftmann@21147
   405
  (* NOTE: you may use Haskell only once in this document, otherwise
haftmann@21147
   406
  you have to work in distinct subdirectories *)
haftmann@21058
   407
haftmann@21058
   408
text {*
haftmann@21058
   409
  \lsthaskell{Thy/examples/Codegen.hs}
haftmann@22798
   410
  \noindent (we have left out all other modules).
haftmann@21058
   411
haftmann@22798
   412
  \medskip
haftmann@21058
   413
haftmann@21058
   414
  The whole code in SML with explicit dictionary passing:
haftmann@21058
   415
*}
haftmann@21058
   416
haftmann@24348
   417
export_code dummy (*<*)in SML(*>*)in SML file "examples/class.ML"
haftmann@21058
   418
haftmann@21058
   419
text {*
haftmann@21058
   420
  \lstsml{Thy/examples/class.ML}
haftmann@21058
   421
haftmann@22798
   422
  \medskip
haftmann@22798
   423
haftmann@22798
   424
  \noindent or in OCaml:
haftmann@22292
   425
*}
haftmann@22292
   426
haftmann@24348
   427
export_code dummy in OCaml file "examples/class.ocaml"
haftmann@22292
   428
haftmann@22292
   429
text {*
haftmann@22292
   430
  \lstsml{Thy/examples/class.ocaml}
haftmann@22798
   431
haftmann@22798
   432
  \medskip The explicit association of constants
haftmann@22798
   433
  to classes can be inspected using the @{text "\<PRINTCLASSES>"}
haftmann@22845
   434
  command.
haftmann@22292
   435
*}
haftmann@22292
   436
haftmann@22550
   437
haftmann@21058
   438
section {* Recipes and advanced topics \label{sec:advanced} *}
haftmann@21058
   439
haftmann@21089
   440
text {*
haftmann@21089
   441
  In this tutorial, we do not attempt to give an exhaustive
haftmann@21089
   442
  description of the code generator framework; instead,
haftmann@21089
   443
  we cast a light on advanced topics by introducing
haftmann@21089
   444
  them together with practically motivated examples.  Concerning
haftmann@21089
   445
  further reading, see
haftmann@21089
   446
haftmann@21089
   447
  \begin{itemize}
haftmann@21089
   448
haftmann@21089
   449
  \item the Isabelle/Isar Reference Manual \cite{isabelle-isar-ref}
haftmann@21089
   450
    for exhaustive syntax diagrams.
haftmann@24193
   451
  \item or \cite{Haftmann-Nipkow:2007:codegen} which deals with foundational issues
haftmann@21089
   452
    of the code generator framework.
haftmann@21089
   453
haftmann@21089
   454
  \end{itemize}
haftmann@21089
   455
*}
haftmann@21058
   456
haftmann@22798
   457
subsection {* Library theories \label{sec:library} *}
haftmann@21058
   458
haftmann@21089
   459
text {*
haftmann@22916
   460
  The @{text HOL} @{text Main} theory already provides a code
haftmann@22916
   461
  generator setup
haftmann@21089
   462
  which should be suitable for most applications. Common extensions
haftmann@22916
   463
  and modifications are available by certain theories of the @{text HOL}
haftmann@21089
   464
  library; beside being useful in applications, they may serve
haftmann@21178
   465
  as a tutorial for customizing the code generator setup.
haftmann@21089
   466
haftmann@21089
   467
  \begin{description}
haftmann@21089
   468
haftmann@24994
   469
    \item[@{text "Code_Integer"}] represents @{text HOL} integers by big
haftmann@22798
   470
       integer literals in target languages.
haftmann@24994
   471
    \item[@{text "Code_Char"}] represents @{text HOL} characters by 
haftmann@22798
   472
       character literals in target languages.
haftmann@24994
   473
    \item[@{text "Code_Char_chr"}] like @{text "Code_Char"},
haftmann@22798
   474
       but also offers treatment of character codes; includes
haftmann@24994
   475
       @{text "Code_Integer"}.
haftmann@23850
   476
    \item[@{text "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers,
haftmann@21178
   477
       which in general will result in higher efficency; pattern
wenzelm@25369
   478
       matching with @{term "0\<Colon>nat"} / @{const "Suc"}
haftmann@24994
   479
       is eliminated;  includes @{text "Code_Integer"}.
haftmann@24994
   480
    \item[@{text "Code_Index"}] provides an additional datatype
haftmann@24994
   481
       @{text index} which is mapped to target-language built-in integers.
haftmann@24994
   482
       Useful for code setups which involve e.g. indexing of
haftmann@24994
   483
       target-language arrays.
haftmann@24994
   484
    \item[@{text "Code_Message"}] provides an additional datatype
haftmann@24994
   485
       @{text message_string} which is isomorphic to strings;
haftmann@24994
   486
       @{text message_string}s are mapped to target-language strings.
haftmann@24994
   487
       Useful for code setups which involve e.g. printing (error) messages.
haftmann@21089
   488
haftmann@21089
   489
  \end{description}
haftmann@22916
   490
haftmann@22916
   491
  \begin{warn}
haftmann@22916
   492
    When importing any of these theories, they should form the last
haftmann@22916
   493
    items in an import list.  Since these theories adapt the
haftmann@22916
   494
    code generator setup in a non-conservative fashion,
haftmann@22916
   495
    strange effects may occur otherwise.
haftmann@22916
   496
  \end{warn}
haftmann@21089
   497
*}
haftmann@21089
   498
haftmann@20948
   499
subsection {* Preprocessing *}
haftmann@20948
   500
haftmann@21089
   501
text {*
haftmann@21147
   502
  Before selected function theorems are turned into abstract
haftmann@21147
   503
  code, a chain of definitional transformation steps is carried
haftmann@27609
   504
  out: \emph{preprocessing}.  In essence, the preprocessor
haftmann@27609
   505
  consists of two components: a \emph{simpset} and \emph{function transformers}.
haftmann@21147
   506
haftmann@27557
   507
  The \emph{simpset} allows to employ the full generality of the Isabelle
haftmann@27557
   508
  simplifier.  Due to the interpretation of theorems
haftmann@22060
   509
  of defining equations, rewrites are applied to the right
haftmann@21147
   510
  hand side and the arguments of the left hand side of an
haftmann@21147
   511
  equation, but never to the constant heading the left hand side.
haftmann@27557
   512
  An important special case are \emph{inline theorems} which may be
haftmann@27557
   513
  declared an undeclared using the
haftmann@22845
   514
  \emph{code inline} or \emph{code inline del} attribute respectively.
haftmann@21147
   515
  Some common applications:
haftmann@21147
   516
*}
haftmann@21147
   517
haftmann@21147
   518
text_raw {*
haftmann@21147
   519
  \begin{itemize}
haftmann@21147
   520
*}
haftmann@21147
   521
haftmann@22845
   522
text {*
haftmann@22845
   523
     \item replacing non-executable constructs by executable ones:
haftmann@22845
   524
*}     
haftmann@21147
   525
haftmann@22845
   526
  lemma [code inline]:
haftmann@22845
   527
    "x \<in> set xs \<longleftrightarrow> x mem xs" by (induct xs) simp_all
haftmann@22845
   528
haftmann@22845
   529
text {*
haftmann@22845
   530
     \item eliminating superfluous constants:
haftmann@21089
   531
*}
haftmann@21089
   532
haftmann@22845
   533
  lemma [code inline]:
haftmann@22845
   534
    "1 = Suc 0" by simp
haftmann@22845
   535
haftmann@22845
   536
text {*
haftmann@22845
   537
     \item replacing executable but inconvenient constructs:
haftmann@22845
   538
*}
haftmann@22845
   539
haftmann@22845
   540
  lemma [code inline]:
haftmann@22845
   541
    "xs = [] \<longleftrightarrow> List.null xs" by (induct xs) simp_all
haftmann@21147
   542
haftmann@21147
   543
text_raw {*
haftmann@21147
   544
  \end{itemize}
haftmann@21147
   545
*}
haftmann@21147
   546
haftmann@21147
   547
text {*
haftmann@21147
   548
haftmann@27609
   549
  \emph{Function transformers} provide a very general interface,
haftmann@21147
   550
  transforming a list of function theorems to another
haftmann@21147
   551
  list of function theorems, provided that neither the heading
wenzelm@25369
   552
  constant nor its type change.  The @{term "0\<Colon>nat"} / @{const Suc}
haftmann@21323
   553
  pattern elimination implemented in
haftmann@27609
   554
  theory @{text "Efficient_Nat"} (\secref{eff_nat}) uses this
haftmann@21147
   555
  interface.
haftmann@21147
   556
haftmann@27557
   557
  \noindent The current setup of the preprocessor may be inspected using
haftmann@27557
   558
  the @{text "\<PRINTCODESETUP>"} command.
haftmann@27557
   559
haftmann@21147
   560
  \begin{warn}
haftmann@27609
   561
    The attribute \emph{code unfold}
haftmann@21147
   562
    associated with the existing code generator also applies to
haftmann@21147
   563
    the new one: \emph{code unfold} implies \emph{code inline}.
haftmann@21147
   564
  \end{warn}
haftmann@21147
   565
*}
haftmann@20948
   566
haftmann@22798
   567
haftmann@22798
   568
subsection {* Concerning operational equality *}
haftmann@22798
   569
haftmann@22798
   570
text {*
haftmann@22798
   571
  Surely you have already noticed how equality is treated
haftmann@22798
   572
  by the code generator:
haftmann@22798
   573
*}
haftmann@22798
   574
haftmann@25870
   575
primrec
haftmann@22798
   576
  collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
haftmann@22798
   577
    "collect_duplicates xs ys [] = xs"
haftmann@22798
   578
  | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs
haftmann@22798
   579
      then if z \<in> set ys
haftmann@22798
   580
        then collect_duplicates xs ys zs
haftmann@22798
   581
        else collect_duplicates xs (z#ys) zs
haftmann@22798
   582
      else collect_duplicates (z#xs) (z#ys) zs)"
haftmann@22798
   583
haftmann@22798
   584
text {*
haftmann@22798
   585
  The membership test during preprocessing is rewritten,
haftmann@23850
   586
  resulting in @{const List.member}, which itself
haftmann@22798
   587
  performs an explicit equality check.
haftmann@22798
   588
*}
haftmann@22798
   589
haftmann@24348
   590
export_code collect_duplicates (*<*)in SML(*>*)in SML file "examples/collect_duplicates.ML"
haftmann@22798
   591
haftmann@22798
   592
text {*
haftmann@22798
   593
  \lstsml{Thy/examples/collect_duplicates.ML}
haftmann@22798
   594
*}
haftmann@22798
   595
haftmann@22798
   596
text {*
haftmann@22798
   597
  Obviously, polymorphic equality is implemented the Haskell
haftmann@26513
   598
  way using a type class.  How is this achieved?  HOL introduces
haftmann@26513
   599
  an explicit class @{text eq} with a corresponding operation
haftmann@26732
   600
  @{const eq_class.eq} such that @{thm eq [no_vars]}.
haftmann@26513
   601
  The preprocessing framework does the rest.
haftmann@22798
   602
  For datatypes, instances of @{text eq} are implicitly derived
haftmann@26513
   603
  when possible.  For other types, you may instantiate @{text eq}
haftmann@26513
   604
  manually like any other type class.
haftmann@22798
   605
haftmann@22798
   606
  Though this @{text eq} class is designed to get rarely in
haftmann@22798
   607
  the way, a subtlety
haftmann@22798
   608
  enters the stage when definitions of overloaded constants
haftmann@22798
   609
  are dependent on operational equality.  For example, let
haftmann@22798
   610
  us define a lexicographic ordering on tuples:
haftmann@22798
   611
*}
haftmann@26513
   612
haftmann@25870
   613
instantiation * :: (ord, ord) ord
haftmann@25870
   614
begin
haftmann@25870
   615
haftmann@25870
   616
definition
haftmann@25870
   617
  [code func del]: "p1 < p2 \<longleftrightarrow> (let (x1, y1) = p1; (x2, y2) = p2 in
haftmann@25870
   618
    x1 < x2 \<or> (x1 = x2 \<and> y1 < y2))"
haftmann@22798
   619
haftmann@25870
   620
definition
haftmann@25870
   621
  [code func del]: "p1 \<le> p2 \<longleftrightarrow> (let (x1, y1) = p1; (x2, y2) = p2 in
haftmann@25870
   622
    x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2))"
haftmann@25870
   623
haftmann@25870
   624
instance ..
haftmann@25870
   625
haftmann@25870
   626
end
haftmann@22798
   627
haftmann@22845
   628
lemma ord_prod [code func(*<*), code func del(*>*)]:
haftmann@22798
   629
  "(x1 \<Colon> 'a\<Colon>ord, y1 \<Colon> 'b\<Colon>ord) < (x2, y2) \<longleftrightarrow> x1 < x2 \<or> (x1 = x2 \<and> y1 < y2)"
haftmann@22798
   630
  "(x1 \<Colon> 'a\<Colon>ord, y1 \<Colon> 'b\<Colon>ord) \<le> (x2, y2) \<longleftrightarrow> x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2)"
haftmann@22798
   631
  unfolding less_prod_def less_eq_prod_def by simp_all
haftmann@22798
   632
haftmann@22798
   633
text {*
haftmann@22798
   634
  Then code generation will fail.  Why?  The definition
wenzelm@25369
   635
  of @{term "op \<le>"} depends on equality on both arguments,
haftmann@22798
   636
  which are polymorphic and impose an additional @{text eq}
haftmann@22798
   637
  class constraint, thus violating the type discipline
haftmann@22798
   638
  for class operations.
haftmann@22798
   639
haftmann@22798
   640
  The solution is to add @{text eq} explicitly to the first sort arguments in the
haftmann@22798
   641
  code theorems:
haftmann@22798
   642
*}
haftmann@22798
   643
haftmann@22798
   644
lemma ord_prod_code [code func]:
haftmann@22798
   645
  "(x1 \<Colon> 'a\<Colon>{ord, eq}, y1 \<Colon> 'b\<Colon>ord) < (x2, y2) \<longleftrightarrow>
haftmann@22798
   646
    x1 < x2 \<or> (x1 = x2 \<and> y1 < y2)"
haftmann@22798
   647
  "(x1 \<Colon> 'a\<Colon>{ord, eq}, y1 \<Colon> 'b\<Colon>ord) \<le> (x2, y2) \<longleftrightarrow>
haftmann@22798
   648
    x1 < x2 \<or> (x1 = x2 \<and> y1 \<le> y2)"
haftmann@22798
   649
  unfolding ord_prod by rule+
haftmann@22798
   650
haftmann@22798
   651
text {*
haftmann@22798
   652
  \noindent Then code generation succeeds:
haftmann@22798
   653
*}
haftmann@22798
   654
haftmann@24348
   655
export_code "op \<le> \<Colon> 'a\<Colon>{eq, ord} \<times> 'b\<Colon>ord \<Rightarrow> 'a \<times> 'b \<Rightarrow> bool"
haftmann@22845
   656
  (*<*)in SML(*>*)in SML file "examples/lexicographic.ML"
haftmann@22798
   657
haftmann@22798
   658
text {*
haftmann@22798
   659
  \lstsml{Thy/examples/lexicographic.ML}
haftmann@22798
   660
*}
haftmann@22798
   661
haftmann@22798
   662
text {*
haftmann@22798
   663
  In general, code theorems for overloaded constants may have more
haftmann@22798
   664
  restrictive sort constraints than the underlying instance relation
haftmann@22798
   665
  between class and type constructor as long as the whole system of
haftmann@22798
   666
  constraints is coregular; code theorems violating coregularity
haftmann@22798
   667
  are rejected immediately.  Consequently, it might be necessary
haftmann@22798
   668
  to delete disturbing theorems in the code theorem table,
haftmann@22798
   669
  as we have done here with the original definitions @{text less_prod_def}
haftmann@22798
   670
  and @{text less_eq_prod_def}.
haftmann@22885
   671
haftmann@22885
   672
  In some cases, the automatically derived defining equations
haftmann@22885
   673
  for equality on a particular type may not be appropriate.
haftmann@22885
   674
  As example, watch the following datatype representing
haftmann@23130
   675
  monomorphic parametric types (where type constructors
haftmann@23130
   676
  are referred to by natural numbers):
haftmann@22798
   677
*}
haftmann@22798
   678
haftmann@23130
   679
datatype monotype = Mono nat "monotype list"
haftmann@22885
   680
(*<*)
haftmann@22885
   681
lemma monotype_eq:
haftmann@22885
   682
  "Mono tyco1 typargs1 = Mono tyco2 typargs2 \<equiv> 
haftmann@22885
   683
     tyco1 = tyco2 \<and> typargs1 = typargs2" by simp
haftmann@22885
   684
(*>*)
haftmann@22885
   685
haftmann@22885
   686
text {*
haftmann@22885
   687
  Then code generation for SML would fail with a message
haftmann@22885
   688
  that the generated code conains illegal mutual dependencies:
haftmann@23130
   689
  the theorem @{thm monotype_eq [no_vars]} already requires the
haftmann@22885
   690
  instance @{text "monotype \<Colon> eq"}, which itself requires
haftmann@23130
   691
  @{thm monotype_eq [no_vars]};  Haskell has no problem with mutually
haftmann@22885
   692
  recursive @{text instance} and @{text function} definitions,
haftmann@22885
   693
  but the SML serializer does not support this.
haftmann@22885
   694
haftmann@22885
   695
  In such cases, you have to provide you own equality equations
haftmann@22885
   696
  involving auxiliary constants.  In our case,
haftmann@22885
   697
  @{const [show_types] list_all2} can do the job:
haftmann@22885
   698
*}
haftmann@22885
   699
haftmann@22885
   700
lemma monotype_eq_list_all2 [code func]:
haftmann@22885
   701
  "Mono tyco1 typargs1 = Mono tyco2 typargs2 \<longleftrightarrow>
haftmann@22885
   702
     tyco1 = tyco2 \<and> list_all2 (op =) typargs1 typargs2"
haftmann@22885
   703
  by (simp add: list_all2_eq [symmetric])
haftmann@22885
   704
haftmann@22885
   705
text {*
haftmann@22885
   706
  does not depend on instance @{text "monotype \<Colon> eq"}:
haftmann@22885
   707
*}
haftmann@22885
   708
haftmann@24348
   709
export_code "op = :: monotype \<Rightarrow> monotype \<Rightarrow> bool"
haftmann@22885
   710
  (*<*)in SML(*>*)in SML file "examples/monotype.ML"
haftmann@22885
   711
haftmann@22885
   712
text {*
haftmann@22885
   713
  \lstsml{Thy/examples/monotype.ML}
haftmann@22885
   714
*}
haftmann@22798
   715
haftmann@22798
   716
subsection {* Programs as sets of theorems *}
haftmann@22798
   717
haftmann@22798
   718
text {*
haftmann@22798
   719
  As told in \secref{sec:concept}, code generation is based
haftmann@22798
   720
  on a structured collection of code theorems.
haftmann@22798
   721
  For explorative purpose, this collection
haftmann@22798
   722
  may be inspected using the @{text "\<CODETHMS>"} command:
haftmann@22798
   723
*}
haftmann@22798
   724
haftmann@22798
   725
code_thms "op mod :: nat \<Rightarrow> nat \<Rightarrow> nat"
haftmann@22798
   726
haftmann@22798
   727
text {*
haftmann@22798
   728
  \noindent prints a table with \emph{all} defining equations
wenzelm@25369
   729
  for @{term "op mod :: nat \<Rightarrow> nat \<Rightarrow> nat"}, including
haftmann@22798
   730
  \emph{all} defining equations those equations depend
haftmann@22798
   731
  on recursivly.  @{text "\<CODETHMS>"} provides a convenient
haftmann@22798
   732
  mechanism to inspect the impact of a preprocessor setup
haftmann@22798
   733
  on defining equations.
haftmann@22798
   734
  
haftmann@22798
   735
  Similarly, the @{text "\<CODEDEPS>"} command shows a graph
haftmann@22798
   736
  visualizing dependencies between defining equations.
haftmann@22798
   737
*}
haftmann@22798
   738
haftmann@22798
   739
haftmann@25411
   740
subsection {* Constructor sets for datatypes *}
haftmann@25411
   741
haftmann@25411
   742
text {*
haftmann@25411
   743
  Conceptually, any datatype is spanned by a set of
haftmann@25411
   744
  \emph{constructors} of type @{text "\<tau> = \<dots> \<Rightarrow> \<kappa> \<alpha>\<^isub>1 \<dots> \<alpha>\<^isub>n"}
haftmann@25411
   745
  where @{text "{\<alpha>\<^isub>1, \<dots>, \<alpha>\<^isub>n}"} is excactly the set of \emph{all}
haftmann@25411
   746
  type variables in @{text "\<tau>"}.  The HOL datatype package
haftmann@25411
   747
  by default registers any new datatype in the table
haftmann@25411
   748
  of datatypes, which may be inspected using
haftmann@25411
   749
  the @{text "\<PRINTCODESETUP>"} command.
haftmann@25411
   750
haftmann@25411
   751
  In some cases, it may be convenient to alter or
haftmann@26999
   752
  extend this table;  as an example, we will develope an alternative
haftmann@26999
   753
  representation of natural numbers as binary digits, whose
haftmann@26999
   754
  size does increase logarithmically with its value, not linear
haftmann@26999
   755
  \footnote{Indeed, the @{text "Efficient_Nat"} theory \ref{eff_nat}
haftmann@26999
   756
    does something similar}.  First, the digit representation:
haftmann@26999
   757
*}
haftmann@26999
   758
haftmann@26999
   759
definition Dig0 :: "nat \<Rightarrow> nat" where
haftmann@26999
   760
  "Dig0 n = 2 * n"
haftmann@26999
   761
haftmann@26999
   762
definition Dig1 :: "nat \<Rightarrow> nat" where
haftmann@26999
   763
  "Dig1 n = Suc (2 * n)"
haftmann@26999
   764
haftmann@26999
   765
text {*
haftmann@26999
   766
  \noindent We will use these two ">digits"< to represent natural numbers
haftmann@26999
   767
  in binary digits, e.g.:
haftmann@26999
   768
*}
haftmann@26999
   769
haftmann@26999
   770
lemma 42: "42 = Dig0 (Dig1 (Dig0 (Dig1 (Dig0 1))))"
haftmann@26999
   771
  by (simp add: Dig0_def Dig1_def)
haftmann@26999
   772
haftmann@26999
   773
text {*
haftmann@26999
   774
  \noindent Of course we also have to provide proper code equations for
haftmann@26999
   775
  the operations, e.g. @{term "op + \<Colon> nat \<Rightarrow> nat \<Rightarrow> nat"}:
haftmann@26973
   776
*}
haftmann@25411
   777
haftmann@26999
   778
lemma plus_Dig [code func]:
haftmann@26999
   779
  "0 + n = n"
haftmann@26999
   780
  "m + 0 = m"
haftmann@26999
   781
  "1 + Dig0 n = Dig1 n"
haftmann@26999
   782
  "Dig0 m + 1 = Dig1 m"
haftmann@26999
   783
  "1 + Dig1 n = Dig0 (n + 1)"
haftmann@26999
   784
  "Dig1 m + 1 = Dig0 (m + 1)"
haftmann@26999
   785
  "Dig0 m + Dig0 n = Dig0 (m + n)"
haftmann@26999
   786
  "Dig0 m + Dig1 n = Dig1 (m + n)"
haftmann@26999
   787
  "Dig1 m + Dig0 n = Dig1 (m + n)"
haftmann@26999
   788
  "Dig1 m + Dig1 n = Dig0 (m + n + 1)"
haftmann@26999
   789
  by (simp_all add: Dig0_def Dig1_def)
haftmann@26999
   790
haftmann@26999
   791
text {*
haftmann@26999
   792
  \noindent We then instruct the code generator to view @{term "0\<Colon>nat"},
haftmann@26999
   793
  @{term "1\<Colon>nat"}, @{term Dig0} and @{term Dig1} as
haftmann@26999
   794
  datatype constructors:
haftmann@26999
   795
*}
haftmann@26999
   796
haftmann@26999
   797
code_datatype "0\<Colon>nat" "1\<Colon>nat" Dig0 Dig1
haftmann@25411
   798
haftmann@26999
   799
text {*
haftmann@26999
   800
  \noindent For the former constructor @{term Suc}, we provide a code
haftmann@26999
   801
  equation and remove some parts of the default code generator setup
haftmann@26999
   802
  which are an obstacle here:
haftmann@26999
   803
*}
haftmann@26999
   804
haftmann@26999
   805
lemma Suc_Dig [code func]:
haftmann@26999
   806
  "Suc n = n + 1"
haftmann@26999
   807
  by simp
haftmann@25411
   808
haftmann@26999
   809
declare One_nat_def [code inline del]
haftmann@26999
   810
declare add_Suc_shift [code func del] 
haftmann@26999
   811
haftmann@26999
   812
text {*
haftmann@26999
   813
  \noindent This yields the following code:
haftmann@26999
   814
*}
haftmann@26999
   815
haftmann@26999
   816
export_code "op + \<Colon> nat \<Rightarrow> nat \<Rightarrow> nat" (*<*)in SML(*>*) in SML file "examples/nat_binary.ML"
haftmann@26999
   817
haftmann@26999
   818
text {* \lstsml{Thy/examples/nat_binary.ML} *}
haftmann@25411
   819
haftmann@25411
   820
text {*
haftmann@25411
   821
  \medskip
haftmann@25411
   822
haftmann@26999
   823
  From this example, it can be easily glimpsed that using own constructor sets
haftmann@26999
   824
  is a little delicate since it changes the set of valid patterns for values
haftmann@26999
   825
  of that type.  Without going into much detail, here some practical hints:
haftmann@26999
   826
haftmann@26999
   827
  \begin{itemize}
haftmann@26999
   828
    \item When changing the constuctor set for datatypes, take care to
haftmann@26999
   829
      provide an alternative for the @{text case} combinator (e.g. by replacing
haftmann@26999
   830
      it using the preprocessor).
haftmann@26999
   831
    \item Values in the target language need not to be normalized -- different
haftmann@26999
   832
      values in the target language may represent the same value in the
haftmann@26999
   833
      logic (e.g. @{text "Dig1 0 = 1"}).
haftmann@26999
   834
    \item Usually, a good methodology to deal with the subleties of pattern
haftmann@26999
   835
      matching is to see the type as an abstract type: provide a set
haftmann@26999
   836
      of operations which operate on the concrete representation of the type,
haftmann@26999
   837
      and derive further operations by combinations of these primitive ones,
haftmann@26999
   838
      without relying on a particular representation.
haftmann@26999
   839
  \end{itemize}
haftmann@25411
   840
*}
haftmann@25411
   841
haftmann@26999
   842
code_datatype %invisible "0::nat" Suc
haftmann@26999
   843
declare %invisible plus_Dig [code func del]
haftmann@26999
   844
declare %invisible One_nat_def [code inline]
haftmann@26999
   845
declare %invisible add_Suc_shift [code func] 
haftmann@26999
   846
lemma %invisible [code func]: "0 + n = (n \<Colon> nat)" by simp
haftmann@26999
   847
haftmann@25411
   848
haftmann@21058
   849
subsection {* Customizing serialization  *}
haftmann@20948
   850
haftmann@22798
   851
subsubsection {* Basics *}
haftmann@22798
   852
haftmann@21147
   853
text {*
haftmann@21147
   854
  Consider the following function and its corresponding
haftmann@21147
   855
  SML code:
haftmann@21147
   856
*}
haftmann@21147
   857
haftmann@25870
   858
primrec
haftmann@21147
   859
  in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where
haftmann@21147
   860
  "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"
haftmann@21147
   861
(*<*)
haftmann@21323
   862
code_type %tt bool
haftmann@21147
   863
  (SML)
haftmann@21323
   864
code_const %tt True and False and "op \<and>" and Not
haftmann@21147
   865
  (SML and and and)
haftmann@21147
   866
(*>*)
haftmann@24348
   867
export_code in_interval (*<*)in SML(*>*)in SML file "examples/bool_literal.ML"
haftmann@21147
   868
haftmann@21147
   869
text {*
haftmann@21323
   870
  \lstsml{Thy/examples/bool_literal.ML}
haftmann@21147
   871
haftmann@22798
   872
  \noindent Though this is correct code, it is a little bit unsatisfactory:
haftmann@21147
   873
  boolean values and operators are materialized as distinguished
haftmann@21147
   874
  entities with have nothing to do with the SML-builtin notion
haftmann@21147
   875
  of \qt{bool}.  This results in less readable code;
haftmann@21147
   876
  additionally, eager evaluation may cause programs to
haftmann@21147
   877
  loop or break which would perfectly terminate when
haftmann@21147
   878
  the existing SML \qt{bool} would be used.  To map
haftmann@21147
   879
  the HOL \qt{bool} on SML \qt{bool}, we may use
haftmann@21147
   880
  \qn{custom serializations}:
haftmann@21147
   881
*}
haftmann@21147
   882
haftmann@21323
   883
code_type %tt bool
haftmann@21147
   884
  (SML "bool")
haftmann@21323
   885
code_const %tt True and False and "op \<and>"
haftmann@21147
   886
  (SML "true" and "false" and "_ andalso _")
haftmann@21147
   887
haftmann@21147
   888
text {*
haftmann@21323
   889
  The @{text "\<CODETYPE>"} commad takes a type constructor
haftmann@21147
   890
  as arguments together with a list of custom serializations.
haftmann@21147
   891
  Each custom serialization starts with a target language
haftmann@21147
   892
  identifier followed by an expression, which during
haftmann@21147
   893
  code serialization is inserted whenever the type constructor
haftmann@21323
   894
  would occur.  For constants, @{text "\<CODECONST>"} implements
haftmann@21323
   895
  the corresponding mechanism.  Each ``@{verbatim "_"}'' in
haftmann@21147
   896
  a serialization expression is treated as a placeholder
haftmann@21147
   897
  for the type constructor's (the constant's) arguments.
haftmann@21147
   898
*}
haftmann@21147
   899
haftmann@24348
   900
export_code in_interval (*<*)in SML(*>*) in SML file "examples/bool_mlbool.ML"
haftmann@21147
   901
haftmann@21147
   902
text {*
haftmann@21323
   903
  \lstsml{Thy/examples/bool_mlbool.ML}
haftmann@21147
   904
haftmann@22798
   905
  \noindent This still is not perfect: the parentheses
haftmann@21323
   906
  around the \qt{andalso} expression are superfluous.
haftmann@21323
   907
  Though the serializer
haftmann@21147
   908
  by no means attempts to imitate the rich Isabelle syntax
haftmann@21147
   909
  framework, it provides some common idioms, notably
haftmann@21147
   910
  associative infixes with precedences which may be used here:
haftmann@21147
   911
*}
haftmann@21147
   912
haftmann@21323
   913
code_const %tt "op \<and>"
haftmann@21147
   914
  (SML infixl 1 "andalso")
haftmann@21147
   915
haftmann@24348
   916
export_code in_interval (*<*)in SML(*>*) in SML file "examples/bool_infix.ML"
haftmann@21147
   917
haftmann@21147
   918
text {*
haftmann@21323
   919
  \lstsml{Thy/examples/bool_infix.ML}
haftmann@21147
   920
haftmann@22798
   921
  \medskip
haftmann@22798
   922
haftmann@21147
   923
  Next, we try to map HOL pairs to SML pairs, using the
haftmann@21323
   924
  infix ``@{verbatim "*"}'' type constructor and parentheses:
haftmann@21147
   925
*}
haftmann@21147
   926
(*<*)
haftmann@21147
   927
code_type *
haftmann@21147
   928
  (SML)
haftmann@21147
   929
code_const Pair
haftmann@21147
   930
  (SML)
haftmann@21147
   931
(*>*)
haftmann@21323
   932
code_type %tt *
haftmann@21147
   933
  (SML infix 2 "*")
haftmann@21323
   934
code_const %tt Pair
haftmann@21147
   935
  (SML "!((_),/ (_))")
haftmann@21147
   936
haftmann@21147
   937
text {*
haftmann@21323
   938
  The initial bang ``@{verbatim "!"}'' tells the serializer to never put
haftmann@21147
   939
  parentheses around the whole expression (they are already present),
haftmann@21147
   940
  while the parentheses around argument place holders
haftmann@21147
   941
  tell not to put parentheses around the arguments.
haftmann@21323
   942
  The slash ``@{verbatim "/"}'' (followed by arbitrary white space)
haftmann@21147
   943
  inserts a space which may be used as a break if necessary
haftmann@21147
   944
  during pretty printing.
haftmann@21147
   945
haftmann@22798
   946
  These examples give a glimpse what mechanisms
haftmann@21178
   947
  custom serializations provide; however their usage
haftmann@21178
   948
  requires careful thinking in order not to introduce
haftmann@21178
   949
  inconsistencies -- or, in other words:
haftmann@21178
   950
  custom serializations are completely axiomatic.
haftmann@21147
   951
haftmann@21178
   952
  A further noteworthy details is that any special
haftmann@21178
   953
  character in a custom serialization may be quoted
haftmann@21323
   954
  using ``@{verbatim "'"}''; thus, in
haftmann@21323
   955
  ``@{verbatim "fn '_ => _"}'' the first
haftmann@21323
   956
  ``@{verbatim "_"}'' is a proper underscore while the
haftmann@21323
   957
  second ``@{verbatim "_"}'' is a placeholder.
haftmann@21147
   958
haftmann@21178
   959
  The HOL theories provide further
haftmann@25411
   960
  examples for custom serializations.
haftmann@21178
   961
*}
haftmann@21147
   962
haftmann@22188
   963
haftmann@21178
   964
subsubsection {* Haskell serialization *}
haftmann@21178
   965
haftmann@21178
   966
text {*
haftmann@21178
   967
  For convenience, the default
haftmann@21178
   968
  HOL setup for Haskell maps the @{text eq} class to
haftmann@21178
   969
  its counterpart in Haskell, giving custom serializations
haftmann@21323
   970
  for the class (@{text "\<CODECLASS>"}) and its operation:
haftmann@21178
   971
*}
haftmann@21178
   972
haftmann@21323
   973
code_class %tt eq
haftmann@22798
   974
  (Haskell "Eq" where "op =" \<equiv> "(==)")
haftmann@21178
   975
haftmann@22798
   976
code_const %tt "op ="
haftmann@21178
   977
  (Haskell infixl 4 "==")
haftmann@21178
   978
haftmann@21178
   979
text {*
haftmann@21178
   980
  A problem now occurs whenever a type which
haftmann@21178
   981
  is an instance of @{text eq} in HOL is mapped
haftmann@21178
   982
  on a Haskell-builtin type which is also an instance
haftmann@21178
   983
  of Haskell @{text Eq}:
haftmann@21147
   984
*}
haftmann@21147
   985
haftmann@21178
   986
typedecl bar
haftmann@21178
   987
haftmann@26513
   988
instantiation bar :: eq
haftmann@26513
   989
begin
haftmann@26513
   990
haftmann@26732
   991
definition "eq_class.eq (x\<Colon>bar) y \<longleftrightarrow> x = y"
haftmann@26513
   992
haftmann@26513
   993
instance by default (simp add: eq_bar_def)
haftmann@26513
   994
haftmann@26513
   995
end
haftmann@21178
   996
haftmann@21323
   997
code_type %tt bar
haftmann@21178
   998
  (Haskell "Integer")
haftmann@21178
   999
haftmann@21178
  1000
text {*
haftmann@22188
  1001
  The code generator would produce
haftmann@22188
  1002
  an additional instance, which of course is rejected.
haftmann@22188
  1003
  To suppress this additional instance, use
haftmann@22188
  1004
  @{text "\<CODEINSTANCE>"}:
haftmann@21147
  1005
*}
haftmann@21147
  1006
haftmann@21323
  1007
code_instance %tt bar :: eq
haftmann@21178
  1008
  (Haskell -)
haftmann@21178
  1009
haftmann@21178
  1010
haftmann@22798
  1011
subsubsection {* Pretty printing *}
haftmann@21189
  1012
haftmann@21189
  1013
text {*
haftmann@22798
  1014
  The serializer provides ML interfaces to set up
haftmann@22798
  1015
  pretty serializations for expressions like lists, numerals
haftmann@22798
  1016
  and characters;  these are
haftmann@22798
  1017
  monolithic stubs and should only be used with the
haftmann@22798
  1018
  theories introduces in \secref{sec:library}.
haftmann@21189
  1019
*}
haftmann@21189
  1020
haftmann@22550
  1021
haftmann@21189
  1022
subsection {* Cyclic module dependencies *}
haftmann@21178
  1023
haftmann@21189
  1024
text {*
haftmann@21189
  1025
  Sometimes the awkward situation occurs that dependencies
haftmann@21189
  1026
  between definitions introduce cyclic dependencies
haftmann@21189
  1027
  between modules, which in the Haskell world leaves
haftmann@21189
  1028
  you to the mercy of the Haskell implementation you are using,
haftmann@21189
  1029
  while for SML code generation is not possible.
haftmann@21178
  1030
haftmann@21189
  1031
  A solution is to declare module names explicitly.
haftmann@21189
  1032
  Let use assume the three cyclically dependent
haftmann@21189
  1033
  modules are named \emph{A}, \emph{B} and \emph{C}.
haftmann@21189
  1034
  Then, by stating
haftmann@21189
  1035
*}
haftmann@21189
  1036
haftmann@21189
  1037
code_modulename SML
haftmann@21189
  1038
  A ABC
haftmann@21189
  1039
  B ABC
haftmann@21189
  1040
  C ABC
haftmann@21189
  1041
haftmann@21189
  1042
text {*
haftmann@21189
  1043
  we explicitly map all those modules on \emph{ABC},
haftmann@21189
  1044
  resulting in an ad-hoc merge of this three modules
haftmann@21189
  1045
  at serialization time.
haftmann@21189
  1046
*}
haftmann@21147
  1047
haftmann@22798
  1048
subsection {* Incremental code generation *}
haftmann@22798
  1049
haftmann@22798
  1050
text {*
haftmann@22798
  1051
  Code generation is \emph{incremental}: theorems
haftmann@22798
  1052
  and abstract intermediate code are cached and extended on demand.
haftmann@22798
  1053
  The cache may be partially or fully dropped if the underlying
haftmann@22798
  1054
  executable content of the theory changes.
haftmann@22798
  1055
  Implementation of caching is supposed to transparently
haftmann@22798
  1056
  hid away the details from the user.  Anyway, caching
haftmann@22798
  1057
  reaches the surface by using a slightly more general form
haftmann@22798
  1058
  of the @{text "\<CODETHMS>"}, @{text "\<CODEDEPS>"}
haftmann@24348
  1059
  and @{text "\<EXPORTCODE>"} commands: the list of constants
haftmann@22798
  1060
  may be omitted.  Then, all constants with code theorems
haftmann@22798
  1061
  in the current cache are referred to.
haftmann@22798
  1062
*}
haftmann@22798
  1063
haftmann@25411
  1064
(*subsection {* Code generation and proof extraction *}
haftmann@21189
  1065
haftmann@21189
  1066
text {*
haftmann@24217
  1067
  \fixme
haftmann@25411
  1068
*}*)
haftmann@21189
  1069
haftmann@20948
  1070
haftmann@21058
  1071
section {* ML interfaces \label{sec:ml} *}
haftmann@20948
  1072
haftmann@21189
  1073
text {*
haftmann@21189
  1074
  Since the code generator framework not only aims to provide
haftmann@21189
  1075
  a nice Isar interface but also to form a base for
haftmann@21189
  1076
  code-generation-based applications, here a short
haftmann@21189
  1077
  description of the most important ML interfaces.
haftmann@21189
  1078
*}
haftmann@21189
  1079
haftmann@24217
  1080
subsection {* Executable theory content: @{text Code} *}
haftmann@21147
  1081
haftmann@21147
  1082
text {*
haftmann@21147
  1083
  This Pure module implements the core notions of
haftmann@21147
  1084
  executable content of a theory.
haftmann@21147
  1085
*}
haftmann@21147
  1086
haftmann@22292
  1087
subsubsection {* Managing executable content *}
haftmann@20948
  1088
haftmann@21147
  1089
text %mlref {*
haftmann@21147
  1090
  \begin{mldecls}
wenzelm@25369
  1091
  @{index_ML Code.add_func: "thm -> theory -> theory"} \\
haftmann@24217
  1092
  @{index_ML Code.del_func: "thm -> theory -> theory"} \\
haftmann@24421
  1093
  @{index_ML Code.add_funcl: "string * thm list Susp.T -> theory -> theory"} \\
haftmann@27557
  1094
  @{index_ML Code.map_pre: "(MetaSimplifier.simpset -> MetaSimplifier.simpset) -> theory -> theory"} \\
haftmann@27557
  1095
  @{index_ML Code.map_post: "(MetaSimplifier.simpset -> MetaSimplifier.simpset) -> theory -> theory"} \\
haftmann@27609
  1096
  @{index_ML Code.add_functrans: "string * (theory -> thm list -> thm list option)
haftmann@21189
  1097
    -> theory -> theory"} \\
haftmann@27557
  1098
  @{index_ML Code.del_functrans: "string -> theory -> theory"} \\
haftmann@24421
  1099
  @{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\
haftmann@24217
  1100
  @{index_ML Code.get_datatype: "theory -> string
haftmann@22479
  1101
    -> (string * sort) list * (string * typ list) list"} \\
haftmann@24421
  1102
  @{index_ML Code.get_datatype_of_constr: "theory -> string -> string option"}
haftmann@21147
  1103
  \end{mldecls}
haftmann@21147
  1104
haftmann@21147
  1105
  \begin{description}
haftmann@21147
  1106
haftmann@24217
  1107
  \item @{ML Code.add_func}~@{text "thm"}~@{text "thy"} adds function
haftmann@21189
  1108
     theorem @{text "thm"} to executable content.
haftmann@21189
  1109
haftmann@24217
  1110
  \item @{ML Code.del_func}~@{text "thm"}~@{text "thy"} removes function
haftmann@21189
  1111
     theorem @{text "thm"} from executable content, if present.
haftmann@21189
  1112
haftmann@24217
  1113
  \item @{ML Code.add_funcl}~@{text "(const, lthms)"}~@{text "thy"} adds
haftmann@22060
  1114
     suspended defining equations @{text lthms} for constant
haftmann@21189
  1115
     @{text const} to executable content.
haftmann@21189
  1116
haftmann@27557
  1117
  \item @{ML Code.map_pre}~@{text "f"}~@{text "thy"} changes
haftmann@27557
  1118
     the preprocessor simpset.
haftmann@21189
  1119
haftmann@27609
  1120
  \item @{ML Code.add_functrans}~@{text "(name, f)"}~@{text "thy"} adds
haftmann@27609
  1121
     function transformer @{text f} (named @{text name}) to executable content;
haftmann@27609
  1122
     @{text f} is a transformer of the defining equations belonging
haftmann@21189
  1123
     to a certain function definition, depending on the
haftmann@27609
  1124
     current theory context.  Returning @{text NONE} indicates that no
haftmann@27609
  1125
     transformation took place;  otherwise, the whole process will be iterated
haftmann@27609
  1126
     with the new defining equations.
haftmann@21189
  1127
haftmann@27557
  1128
  \item @{ML Code.del_functrans}~@{text "name"}~@{text "thy"} removes
haftmann@27609
  1129
     function transformer named @{text name} from executable content.
haftmann@22046
  1130
haftmann@24421
  1131
  \item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds
haftmann@24421
  1132
     a datatype to executable content, with generation
haftmann@24421
  1133
     set @{text cs}.
haftmann@21189
  1134
haftmann@24217
  1135
  \item @{ML Code.get_datatype_of_constr}~@{text "thy"}~@{text "const"}
haftmann@21189
  1136
     returns type constructor corresponding to
haftmann@21189
  1137
     constructor @{text const}; returns @{text NONE}
haftmann@21189
  1138
     if @{text const} is no constructor.
haftmann@21147
  1139
haftmann@21147
  1140
  \end{description}
haftmann@21147
  1141
*}
haftmann@21147
  1142
haftmann@22292
  1143
subsection {* Auxiliary *}
haftmann@21147
  1144
haftmann@21147
  1145
text %mlref {*
haftmann@21147
  1146
  \begin{mldecls}
haftmann@24421
  1147
  @{index_ML CodeUnit.read_const: "theory -> string -> string"} \\
haftmann@26973
  1148
  @{index_ML CodeUnit.head_func: "thm -> string * ((string * sort) list * typ)"} \\
haftmann@27557
  1149
  @{index_ML CodeUnit.rewrite_func: "MetaSimplifier.simpset -> thm -> thm"} \\
haftmann@21147
  1150
  \end{mldecls}
haftmann@21217
  1151
haftmann@21217
  1152
  \begin{description}
haftmann@21217
  1153
haftmann@24217
  1154
  \item @{ML CodeUnit.read_const}~@{text thy}~@{text s}
haftmann@21217
  1155
     reads a constant as a concrete term expression @{text s}.
haftmann@21217
  1156
haftmann@24217
  1157
  \item @{ML CodeUnit.head_func}~@{text thm}
haftmann@22751
  1158
     extracts the constant and its type from a defining equation @{text thm}.
haftmann@21217
  1159
haftmann@27557
  1160
  \item @{ML CodeUnit.rewrite_func}~@{text ss}~@{text thm}
haftmann@27557
  1161
     rewrites a defining equation @{text thm} with a simpset @{text ss};
haftmann@27557
  1162
     only arguments and right hand side are rewritten,
haftmann@22060
  1163
     not the head of the defining equation.
haftmann@21217
  1164
haftmann@21217
  1165
  \end{description}
haftmann@21217
  1166
haftmann@21147
  1167
*}
haftmann@20948
  1168
haftmann@20948
  1169
subsection {* Implementing code generator applications *}
haftmann@20948
  1170
haftmann@21147
  1171
text {*
haftmann@21217
  1172
  Implementing code generator applications on top
haftmann@21217
  1173
  of the framework set out so far usually not only
haftmann@21217
  1174
  involves using those primitive interfaces
haftmann@21217
  1175
  but also storing code-dependent data and various
haftmann@21217
  1176
  other things.
haftmann@21217
  1177
haftmann@21147
  1178
  \begin{warn}
haftmann@21147
  1179
    Some interfaces discussed here have not reached
haftmann@21147
  1180
    a final state yet.
haftmann@21147
  1181
    Changes likely to occur in future.
haftmann@21147
  1182
  \end{warn}
haftmann@21147
  1183
*}
haftmann@21147
  1184
haftmann@21147
  1185
subsubsection {* Data depending on the theory's executable content *}
haftmann@21147
  1186
haftmann@21217
  1187
text {*
haftmann@21452
  1188
  Due to incrementality of code generation, changes in the
haftmann@21452
  1189
  theory's executable content have to be propagated in a
haftmann@21452
  1190
  certain fashion.  Additionally, such changes may occur
haftmann@21452
  1191
  not only during theory extension but also during theory
haftmann@21452
  1192
  merge, which is a little bit nasty from an implementation
haftmann@21452
  1193
  point of view.  The framework provides a solution
haftmann@21452
  1194
  to this technical challenge by providing a functorial
haftmann@21452
  1195
  data slot @{ML_functor CodeDataFun}; on instantiation
haftmann@21452
  1196
  of this functor, the following types and operations
haftmann@21452
  1197
  are required:
haftmann@21452
  1198
haftmann@21217
  1199
  \medskip
haftmann@21217
  1200
  \begin{tabular}{l}
haftmann@21217
  1201
  @{text "type T"} \\
haftmann@21217
  1202
  @{text "val empty: T"} \\
haftmann@21217
  1203
  @{text "val merge: Pretty.pp \<rightarrow> T * T \<rightarrow> T"} \\
haftmann@24217
  1204
  @{text "val purge: theory option \<rightarrow> CodeUnit.const list option \<rightarrow> T \<rightarrow> T"}
haftmann@21217
  1205
  \end{tabular}
haftmann@21217
  1206
haftmann@21452
  1207
  \begin{description}
haftmann@21452
  1208
haftmann@21452
  1209
  \item @{text T} the type of data to store.
haftmann@21452
  1210
haftmann@21452
  1211
  \item @{text empty} initial (empty) data.
haftmann@21452
  1212
haftmann@21452
  1213
  \item @{text merge} merging two data slots.
haftmann@21452
  1214
haftmann@22798
  1215
  \item @{text purge}~@{text thy}~@{text consts} propagates changes in executable content;
haftmann@21452
  1216
    if possible, the current theory context is handed over
haftmann@21452
  1217
    as argument @{text thy} (if there is no current theory context (e.g.~during
haftmann@22798
  1218
    theory merge, @{ML NONE}); @{text consts} indicates the kind
haftmann@21452
  1219
    of change: @{ML NONE} stands for a fundamental change
haftmann@22798
  1220
    which invalidates any existing code, @{text "SOME consts"}
haftmann@22798
  1221
    hints that executable content for constants @{text consts}
haftmann@21452
  1222
    has changed.
haftmann@21452
  1223
haftmann@21452
  1224
  \end{description}
haftmann@21452
  1225
haftmann@21452
  1226
  An instance of @{ML_functor CodeDataFun} provides the following
haftmann@21452
  1227
  interface:
haftmann@21452
  1228
haftmann@21217
  1229
  \medskip
haftmann@21217
  1230
  \begin{tabular}{l}
haftmann@21217
  1231
  @{text "get: theory \<rightarrow> T"} \\
haftmann@21217
  1232
  @{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\
haftmann@21217
  1233
  @{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"}
haftmann@21217
  1234
  \end{tabular}
haftmann@21217
  1235
haftmann@21217
  1236
  \begin{description}
haftmann@21217
  1237
haftmann@21452
  1238
  \item @{text get} retrieval of the current data.
haftmann@21452
  1239
haftmann@21452
  1240
  \item @{text change} update of current data (cached!)
haftmann@21452
  1241
    by giving a continuation.
haftmann@21452
  1242
haftmann@21452
  1243
  \item @{text change_yield} update with side result.
haftmann@21217
  1244
haftmann@21217
  1245
  \end{description}
haftmann@21217
  1246
*}
haftmann@21217
  1247
haftmann@24628
  1248
(*subsubsection {* Datatype hooks *}
haftmann@21147
  1249
haftmann@21452
  1250
text {*
haftmann@21452
  1251
  Isabelle/HOL's datatype package provides a mechanism to
haftmann@21452
  1252
  extend theories depending on datatype declarations:
haftmann@21452
  1253
  \emph{datatype hooks}.  For example, when declaring a new
haftmann@22060
  1254
  datatype, a hook proves defining equations for equality on
haftmann@21452
  1255
  that datatype (if possible).
haftmann@21452
  1256
*}
haftmann@21452
  1257
haftmann@21217
  1258
text %mlref {*
haftmann@21217
  1259
  \begin{mldecls}
haftmann@21323
  1260
  @{index_ML_type DatatypeHooks.hook: "string list -> theory -> theory"} \\
haftmann@21217
  1261
  @{index_ML DatatypeHooks.add: "DatatypeHooks.hook -> theory -> theory"}
haftmann@21217
  1262
  \end{mldecls}
haftmann@21452
  1263
haftmann@21452
  1264
  \begin{description}
haftmann@21452
  1265
haftmann@21452
  1266
  \item @{ML_type DatatypeHooks.hook} specifies the interface
haftmann@21452
  1267
     of \emph{datatype hooks}: a theory update
haftmann@21452
  1268
     depending on the list of newly introduced
haftmann@21452
  1269
     datatype names.
haftmann@21452
  1270
haftmann@21452
  1271
  \item @{ML DatatypeHooks.add} adds a hook to the
haftmann@21452
  1272
     chain of all hooks.
haftmann@21452
  1273
haftmann@21452
  1274
  \end{description}
haftmann@21452
  1275
*}
haftmann@21452
  1276
haftmann@21452
  1277
subsubsection {* Trivial typedefs -- type copies *}
haftmann@21452
  1278
haftmann@21452
  1279
text {*
haftmann@21452
  1280
  Sometimes packages will introduce new types
haftmann@21452
  1281
  as \emph{marked type copies} similar to Haskell's
haftmann@21452
  1282
  @{text newtype} declaration (e.g. the HOL record package)
haftmann@21452
  1283
  \emph{without} tinkering with the overhead of datatypes.
haftmann@21452
  1284
  Technically, these type copies are trivial forms of typedefs.
haftmann@21452
  1285
  Since these type copies in code generation view are nothing
haftmann@21452
  1286
  else than datatypes, they have been given a own package
haftmann@21452
  1287
  in order to faciliate code generation:
haftmann@21147
  1288
*}
haftmann@21058
  1289
haftmann@21217
  1290
text %mlref {*
haftmann@21217
  1291
  \begin{mldecls}
haftmann@21452
  1292
  @{index_ML_type TypecopyPackage.info} \\
haftmann@21217
  1293
  @{index_ML TypecopyPackage.add_typecopy: "
haftmann@21217
  1294
    bstring * string list -> typ -> (bstring * bstring) option
haftmann@21217
  1295
    -> theory -> (string * TypecopyPackage.info) * theory"} \\
haftmann@21217
  1296
  @{index_ML TypecopyPackage.get_typecopy_info: "theory
haftmann@21217
  1297
    -> string -> TypecopyPackage.info option"} \\
haftmann@21217
  1298
  @{index_ML TypecopyPackage.get_spec: "theory -> string
haftmann@21452
  1299
    -> (string * sort) list * (string * typ list) list"} \\
haftmann@21452
  1300
  @{index_ML_type TypecopyPackage.hook: "string * TypecopyPackage.info -> theory -> theory"} \\
haftmann@21452
  1301
  @{index_ML TypecopyPackage.add_hook:
haftmann@21452
  1302
     "TypecopyPackage.hook -> theory -> theory"} \\
haftmann@21217
  1303
  \end{mldecls}
haftmann@21452
  1304
haftmann@21452
  1305
  \begin{description}
haftmann@21452
  1306
haftmann@21452
  1307
  \item @{ML_type TypecopyPackage.info} a record containing
haftmann@21452
  1308
     the specification and further data of a type copy.
haftmann@21452
  1309
haftmann@21452
  1310
  \item @{ML TypecopyPackage.add_typecopy} defines a new
haftmann@21452
  1311
     type copy.
haftmann@21452
  1312
haftmann@21452
  1313
  \item @{ML TypecopyPackage.get_typecopy_info} retrieves
haftmann@21452
  1314
     data of an existing type copy.
haftmann@21452
  1315
haftmann@21452
  1316
  \item @{ML TypecopyPackage.get_spec} retrieves datatype-like
haftmann@21452
  1317
     specification of a type copy.
haftmann@21452
  1318
haftmann@21452
  1319
  \item @{ML_type TypecopyPackage.hook},~@{ML TypecopyPackage.add_hook}
haftmann@21452
  1320
     provide a hook mechanism corresponding to the hook mechanism
haftmann@21452
  1321
     on datatypes.
haftmann@21452
  1322
haftmann@21452
  1323
  \end{description}
haftmann@21452
  1324
*}
haftmann@21452
  1325
haftmann@21452
  1326
subsubsection {* Unifying type copies and datatypes *}
haftmann@21452
  1327
haftmann@21452
  1328
text {*
haftmann@21452
  1329
  Since datatypes and type copies are mapped to the same concept (datatypes)
haftmann@21452
  1330
  by code generation, the view on both is unified \qt{code types}:
haftmann@21217
  1331
*}
haftmann@21217
  1332
haftmann@21217
  1333
text %mlref {*
haftmann@21217
  1334
  \begin{mldecls}
haftmann@21452
  1335
  @{index_ML_type DatatypeCodegen.hook: "(string * (bool * ((string * sort) list
haftmann@21452
  1336
    * (string * typ list) list))) list
haftmann@21323
  1337
    -> theory -> theory"} \\
haftmann@21217
  1338
  @{index_ML DatatypeCodegen.add_codetypes_hook_bootstrap: "
haftmann@21217
  1339
      DatatypeCodegen.hook -> theory -> theory"}
haftmann@21217
  1340
  \end{mldecls}
haftmann@21217
  1341
*}
haftmann@21217
  1342
haftmann@21222
  1343
text {*
haftmann@21452
  1344
  \begin{description}
haftmann@21452
  1345
haftmann@21452
  1346
  \item @{ML_type DatatypeCodegen.hook} specifies the code type hook
haftmann@21452
  1347
     interface: a theory transformation depending on a list of
haftmann@21452
  1348
     mutual recursive code types; each entry in the list
haftmann@21452
  1349
     has the structure @{text "(name, (is_data, (vars, cons)))"}
haftmann@21452
  1350
     where @{text name} is the name of the code type, @{text is_data}
haftmann@21452
  1351
     is true iff @{text name} is a datatype rather then a type copy,
haftmann@21452
  1352
     and @{text "(vars, cons)"} is the specification of the code type.
haftmann@21452
  1353
haftmann@21452
  1354
  \item @{ML DatatypeCodegen.add_codetypes_hook_bootstrap} adds a code
haftmann@21452
  1355
     type hook;  the hook is immediately processed for all already
haftmann@21452
  1356
     existing datatypes, in blocks of mutual recursive datatypes,
haftmann@21452
  1357
     where all datatypes a block depends on are processed before
haftmann@21452
  1358
     the block.
haftmann@21452
  1359
haftmann@21452
  1360
  \end{description}
haftmann@24628
  1361
*}*)
haftmann@21452
  1362
haftmann@24628
  1363
text {*
haftmann@21452
  1364
  \emph{Happy proving, happy hacking!}
haftmann@21222
  1365
*}
haftmann@21217
  1366
haftmann@20948
  1367
end