doc-src/IsarAdvanced/Codegen/Thy/Introduction.thy
author haftmann
Wed, 01 Oct 2008 13:33:54 +0200
changeset 28447 df77ed974a78
parent 28428 fd007794561f
child 28564 1358b1ddd915
theory Introduction
imports Setup
begin

chapter {* Code generation from @{text "Isabelle/HOL"} theories *}

section {* Introduction and Overview *}

text {*
  This tutorial introduces a generic code generator for the
  @{text Isabelle} system.
  It is generic in the sense that the
  \qn{target language} for which code shall ultimately be
  generated is not fixed but may be an arbitrary state-of-the-art
  functional programming language (currently, the implementation
  supports @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml} and @{text Haskell}
  \cite{haskell-revised-report}).

  Conceptually the code generator framework is part
  of Isabelle's @{theory Pure} meta logic framework; the logic
  @{theory HOL}, which is an extension of @{theory Pure},
  already comes with a reasonable framework setup and thus provides
  a good workhorse for building code-generation-driven
  applications.  So, we assume some familiarity and experience
  with the ingredients of the @{theory HOL} distribution theories
  (see also \cite{isa-tutorial}).

  The code generator aims to be usable with no further ado
  in most cases while allowing for detailed customisation.
  This is reflected in the structure of this tutorial: after a short
  conceptual introduction with an example (\secref{sec:intro}),
  we discuss the generic customisation facilities (\secref{sec:program}).
  A further section (\secref{sec:adaption}) is dedicated to the matter of
  \qn{adaption} to specific target language environments.  After some
  further issues (\secref{sec:further}) we conclude with an overview
  of some ML programming interfaces (\secref{sec:ml}).

  \begin{warn}
    Ultimately, the code generator which this tutorial deals with
    is supposed to replace the existing code generator
    by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
    So, for the moment, there are two distinct code generators
    in Isabelle.  In case of ambiguity, we will refer to the framework
    described here as the @{text "generic code generator"} and to the
    other as the @{text "SML code generator"}.
    Also note that while the framework itself is
    object-logic independent, only @{theory HOL} provides a reasonable
    framework setup.
  \end{warn}

*}

subsection {* Code generation via shallow embedding \label{sec:intro} *}

text {*
  The key concept for understanding @{text Isabelle}'s code generation is
  \emph{shallow embedding}, i.e.~logical entities like constants, types and
  classes are identified with corresponding concepts in the target language.

  Inside @{theory HOL}, the @{command datatype} and
  @{command definition}/@{command primrec}/@{command fun} declarations form
  the core of a functional programming language.  The default code generator setup
  makes it possible to turn those into functional programs immediately.
  This means that \qt{naive} code generation can proceed without further ado.
  For example, here is a simple \qt{implementation} of amortised queues:
*}

datatype %quoteme 'a queue = Queue "'a list" "'a list"

definition %quoteme empty :: "'a queue" where
  "empty = Queue [] []"

primrec %quoteme enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where
  "enqueue x (Queue xs ys) = Queue (x # xs) ys"

fun %quoteme dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
    "dequeue (Queue [] []) = (None, Queue [] [])"
  | "dequeue (Queue xs (y # ys)) = (Some y, Queue xs ys)"
  | "dequeue (Queue xs []) =
      (case rev xs of y # ys \<Rightarrow> (Some y, Queue [] ys))"
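
text {*
  \noindent As a quick sanity check of these definitions (a minimal sketch,
  not part of the generated code), enqueueing an element into the empty
  queue and dequeueing again should yield that element.  We assume here
  that the default simplifier setup together with the definition of
  @{text empty} suffices for the proof; the statement itself is only an
  illustration:
*}

lemma "dequeue (enqueue x empty) = (Some x, Queue [] [])"
  by (simp add: empty_def)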

text {* \noindent Then we can generate code e.g.~for @{text SML} as follows: *}

export_code %quoteme empty dequeue enqueue in SML
  module_name Example file "examples/example.ML"

text {* \noindent resulting in the following code: *}

text %quoteme {*@{code_stmts empty enqueue dequeue (SML)}*}

text {*
  \noindent The @{command export_code} command takes a space-separated list of
  constants for which code shall be generated;  anything else needed for those
  is added implicitly.  Then follow a target language identifier
  (@{text SML}, @{text OCaml} or @{text Haskell}) and a freely chosen module name.
  The file name denotes the destination where the generated code is stored.  Note that
  the semantics of the destination depends on the target language:  for
  @{text SML} and @{text OCaml} it denotes a \emph{file}, for @{text Haskell}
  it denotes a \emph{directory} where a file named after the module
  (with extension @{text ".hs"}) is written:
*}

export_code %quoteme empty dequeue enqueue in Haskell
  module_name Example file "examples/"

text {*
  \noindent This is what the corresponding code in @{text Haskell} looks like:
*}

text %quoteme {*@{code_stmts empty enqueue dequeue (Haskell)}*}

text {*
  \noindent This demonstrates the basic usage of the @{command export_code} command;
  for more details see \secref{sec:further}.
*}
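
text {*
  \noindent By the same pattern, code for @{text OCaml} could presumably be
  exported as sketched below.  Since the destination denotes a \emph{file}
  for this target, a file name is given; the name
  @{text "examples/example.ml"} is merely an illustrative choice and not
  taken from the distribution:
*}

export_code %quoteme empty dequeue enqueue in OCaml
  module_name Example file "examples/example.ml"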

subsection {* Code generator architecture \label{sec:concept} *}

text {*
  What you have seen so far should already be enough in a lot of cases.  If you
  are content with this, you can quit reading here.  However, in order to customise
  and adapt the code generator, some understanding of how it works
  is indispensable.

  \begin{figure}[h]
    \centering
    \includegraphics[width=0.7\textwidth]{codegen_process}
    \caption{Code generator architecture}
    \label{fig:arch}
  \end{figure}

  The code generator employs a notion of executability
  for three foundational executable ingredients known
  from functional programming:
  \emph{defining equations}, \emph{datatypes}, and
  \emph{type classes}.  A defining equation as a first approximation
  is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
  (an equation headed by a constant @{text f} with arguments
  @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}).
  Code generation aims to turn defining equations
  into a functional program.  This is achieved by a chain of
  steps which operate sequentially, i.e.~the result of one is
  the input of the next, see Figure~\ref{fig:arch}:

  \begin{itemize}

    \item Out of the vast collection of theorems proven in a
      \qn{theory}, a reasonable subset modelling
      defining equations is \qn{selected}.

    \item On those selected theorems, certain
      transformations are carried out
      (\qn{preprocessing}).  Their purpose is to turn theorems
      representing non- or badly executable
      specifications into equivalent but executable counterparts.
      The \qn{preprocessor} is an interface which
      allows applying
      the full expressiveness of ML-based theorem transformations
      to code generation;  motivating examples are shown below, see
      \secref{sec:preproc}.
      The result of the preprocessing step is a structured collection
      of defining equations (\qn{code theorems}).

    \item These defining equations are \qn{translated} to a program
      in an abstract intermediate language.  Think of it as a kind
      of \qt{Mini-Haskell} with four \qn{statements}: @{text data}
      (for datatypes), @{text fun} (stemming from defining equations),
      also @{text class} and @{text inst} (for type classes).

    \item Finally, the abstract program is \qn{serialised} into concrete
      source code of a target language.

  \end{itemize}

  \noindent Of these steps, only the last two are carried out outside the logic;  by
  keeping this layer as thin as possible, the amount of code to trust is
  kept to a minimum.
*}
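
text {*
  \noindent To get an impression of what the selection and preprocessing
  steps produce, the resulting defining equations can be inspected.  The
  following sketch assumes that the @{command code_thms} command is
  available in the Isabelle version at hand; it is applied to the
  @{text dequeue} function of \secref{sec:intro}, and its output is not
  reproduced here:
*}

code_thms %quoteme dequeue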
end