doc-src/IsarAdvanced/Codegen/Thy/Introduction.thy
author haftmann
Tue, 30 Sep 2008 14:30:44 +0200
changeset 28428 fd007794561f
parent 28420 293b166c45c5
child 28447 df77ed974a78
permissions -rw-r--r--
tuned
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     1
theory Introduction
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     2
imports Setup
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     3
begin
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     4
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     5
chapter {* Code generation from @{text "Isabelle/HOL"} theories *}
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     6
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     7
section {* Introduction and Overview *}
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     8
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     9
text {*
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
    10
  This tutorial introduces a generic code generator for the
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    11
  @{text Isabelle} system.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    12
  Generic in the sense that the
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    13
  \qn{target language} for which code shall ultimately be
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    14
  generated is not fixed but may be an arbitrary state-of-the-art
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    15
  functional programming language (currently, the implementation
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    16
  supports @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml} and @{text Haskell}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    17
  \cite{haskell-revised-report}).
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    18
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    19
  Conceptually the code generator framework is part
28428
haftmann
parents: 28420
diff changeset
    20
  of Isabelle's @{theory Pure} meta logic framework; the logic
haftmann
parents: 28420
diff changeset
    21
  @{theory HOL} which is an extension of @{theory Pure}
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    22
  already comes with a reasonable framework setup and thus provides
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    23
  a good working horse for raising code-generation-driven
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    24
  applications.  So, we assume some familiarity and experience
28428
haftmann
parents: 28420
diff changeset
    25
  with the ingredients of the @{theory HOL} distribution theories.
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    26
  (see also \cite{isa-tutorial}).
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    27
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    28
  The code generator aims to be usable with no further ado
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    29
  in most cases while allowing for detailed customisation.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    30
  This manifests in the structure of this tutorial: after a short
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    31
  conceptual introduction with an example \secref{sec:intro},
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    32
  we discuss the generic customisation facilities \secref{sec:program}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    33
  A further section \secref{sec:adaption} is dedicated to the matter of
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    34
  \qn{adaption} to specific target language environments.  After some
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    35
  further issues \secref{sec:further} we conclude with an overview
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    36
  of some ML programming interfaces \secref{sec:ml}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    37
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    38
  \begin{warn}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    39
    Ultimately, the code generator which this tutorial deals with
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    40
    is supposed to replace the already established code generator
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    41
    by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    42
    So, for the moment, there are two distinct code generators
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    43
    in Isabelle.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    44
    Also note that while the framework itself is
28428
haftmann
parents: 28420
diff changeset
    45
    object-logic independent, only @{theory HOL} provides a reasonable
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    46
    framework setup.    
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    47
  \end{warn}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    48
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
    49
*}
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
    50
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    51
subsection {* Code generation via shallow embedding \label{sec:intro} *}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    52
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    53
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    54
  The key concept for understanding @{text Isabelle}'s code generation is
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    55
  \emph{shallow embedding}, i.e.~logical entities like constants, types and
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    56
  classes are identified with corresponding concepts in the target language.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    57
28428
haftmann
parents: 28420
diff changeset
    58
  Inside @{theory HOL}, the @{command datatype} and
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    59
  @{command definition}/@{command primrec}/@{command fun} declarations form
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    60
  the core of a functional programming language.  The default code generator setup
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    61
  allows to turn those into functional programs immediately.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    62
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    63
  This means that \qt{naive} code generation can proceed without further ado.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    64
  For example, here a simple \qt{implementation} of amortised queues:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    65
*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    66
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    67
datatype %quoteme 'a queue = Queue "'a list" "'a list"
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    68
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    69
definition %quoteme empty :: "'a queue" where
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    70
  "empty = Queue [] []"
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    71
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    72
primrec %quoteme enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    73
  "enqueue x (Queue xs ys) = Queue (x # xs) ys"
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    74
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    75
fun %quoteme dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    76
    "dequeue (Queue [] []) = (None, Queue [] [])"
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    77
  | "dequeue (Queue xs (y # ys)) = (Some y, Queue xs ys)"
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    78
  | "dequeue (Queue xs []) = (case rev xs of y # ys \<Rightarrow> (Some y, Queue [] ys))"
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    79
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    80
text {* \noindent Then we can generate code e.g.~for @{text SML} as follows: *}
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
    81
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    82
export_code %quoteme empty dequeue enqueue in SML module_name Example file "examples/example.ML"
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    83
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    84
text {* \noindent resulting in the following code: *}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    85
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    86
text %quoteme {*@{code_stmts empty enqueue dequeue (SML)}*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    87
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    88
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    89
  \noindent The @{command export_code} command takes a space-separated list of
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    90
  constants for which code shall be generated;  anything else needed for those
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    91
  is added implicitly.  Then follows by a target language identifier
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    92
  (@{text SML}, @{text OCaml} or @{text Haskell}) and a freely chosen module name.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    93
  A file name denotes the destination to store the generated code.  Note that
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    94
  the semantics of the destination depends on the target language:  for
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    95
  @{text SML} and @{text OCaml} it denotes a \emph{file}, for @{text Haskell}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    96
  it denotes a \emph{directory} where a file named as the module name
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    97
  (with extension @{text ".hs"}) is written:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    98
*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    99
28420
293b166c45c5 fixed slips
haftmann
parents: 28419
diff changeset
   100
export_code %quoteme empty dequeue enqueue in Haskell module_name Example file "examples/"
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   101
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   102
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   103
  \noindent This is how the corresponding code in @{text Haskell} looks like:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   104
*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   105
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   106
text %quoteme {*@{code_stmts empty enqueue dequeue (Haskell)}*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   107
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   108
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   109
  \noindent This demonstrates the basic usage of the @{command export_code} command;
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   110
  for more details see \secref{sec:serializer}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   111
*}
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   112
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   113
subsection {* Code generator architecture *}
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   114
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   115
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   116
  What you have seen so far should be already enough in a lot of cases.  If you
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   117
  are content with this, you can quit reading here.  Anyway, in order to customise
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   118
  and adapt the code generator, it is inevitable to gain some understanding
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   119
  how it works.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   120
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   121
  \begin{figure}[h]
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   122
    \centering
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   123
    \includegraphics[width=0.7\textwidth]{codegen_process}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   124
    \caption{Code generator architecture}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   125
    \label{fig:arch}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   126
  \end{figure}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   127
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   128
  The code generator itself consists of three major components
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   129
  which operate sequentially, i.e. the result of one is the input
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   130
  of the next in the chain,  see diagram \ref{fig:arch}:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   131
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   132
  The code generator employs a notion of executability
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   133
  for three foundational executable ingredients known
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   134
  from functional programming:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   135
  \emph{defining equations}, \emph{datatypes}, and
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   136
  \emph{type classes}.  A defining equation as a first approximation
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   137
  is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   138
  (an equation headed by a constant @{text f} with arguments
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   139
  @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}).
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   140
  Code generation aims to turn defining equations
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   141
  into a functional program by running through the following
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   142
  process:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   143
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   144
  \begin{itemize}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   145
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   146
    \item Out of the vast collection of theorems proven in a
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   147
      \qn{theory}, a reasonable subset modelling
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   148
      defining equations is \qn{selected}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   149
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   150
    \item On those selected theorems, certain
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   151
      transformations are carried out
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   152
      (\qn{preprocessing}).  Their purpose is to turn theorems
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   153
      representing non- or badly executable
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   154
      specifications into equivalent but executable counterparts.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   155
      The result is a structured collection of \qn{code theorems}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   156
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   157
    \item Before the selected defining equations are continued with,
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   158
      they can be \qn{preprocessed}, i.e. subjected to theorem
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   159
      transformations.  This \qn{preprocessor} is an interface which
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   160
      allows to apply
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   161
      the full expressiveness of ML-based theorem transformations
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   162
      to code generation;  motivating examples are shown below, see
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   163
      \secref{sec:preproc}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   164
      The result of the preprocessing step is a structured collection
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   165
      of defining equations.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   166
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   167
    \item These defining equations are \qn{translated} to a program
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   168
      in an abstract intermediate language.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   169
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   170
    \item Finally, the abstract program is \qn{serialised} into concrete
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   171
      source code of a target language.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   172
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   173
  \end{itemize}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   174
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   175
  \noindent From these steps, only the two last are carried out outside the logic;  by
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   176
  keeping this layer as thin as possible, the amount of code to trust is
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   177
  kept to a minimum.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   178
*}
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   179
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   180
end