doc-src/Codegen/Thy/Introduction.thy
author haftmann
Wed, 01 Apr 2009 15:16:09 +0200
changeset 30836 1344132160bb
parent 30227 853abb4853cc
child 30880 257cbe43faa8
permissions -rw-r--r--
proper external tikz pictures
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     1
theory Introduction
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     2
imports Setup
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     3
begin
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     4
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     5
section {* Introduction and Overview *}
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     6
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     7
text {*
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
     8
  This tutorial introduces a generic code generator for the
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
     9
  @{text Isabelle} system.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    10
  Generic in the sense that the
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    11
  \qn{target language} for which code shall ultimately be
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    12
  generated is not fixed but may be an arbitrary state-of-the-art
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    13
  functional programming language (currently, the implementation
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    14
  supports @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml} and @{text Haskell}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    15
  \cite{haskell-revised-report}).
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    16
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    17
  Conceptually the code generator framework is part
28428
haftmann
parents: 28420
diff changeset
    18
  of Isabelle's @{theory Pure} meta logic framework; the logic
haftmann
parents: 28420
diff changeset
    19
  @{theory HOL} which is an extension of @{theory Pure}
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    20
  already comes with a reasonable framework setup and thus provides
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    21
  a good working horse for raising code-generation-driven
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    22
  applications.  So, we assume some familiarity and experience
28428
haftmann
parents: 28420
diff changeset
    23
  with the ingredients of the @{theory HOL} distribution theories.
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    24
  (see also \cite{isa-tutorial}).
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    25
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    26
  The code generator aims to be usable with no further ado
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    27
  in most cases while allowing for detailed customisation.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    28
  This manifests in the structure of this tutorial: after a short
28447
haftmann
parents: 28428
diff changeset
    29
  conceptual introduction with an example (\secref{sec:intro}),
haftmann
parents: 28428
diff changeset
    30
  we discuss the generic customisation facilities (\secref{sec:program}).
haftmann
parents: 28428
diff changeset
    31
  A further section (\secref{sec:adaption}) is dedicated to the matter of
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    32
  \qn{adaption} to specific target language environments.  After some
28447
haftmann
parents: 28428
diff changeset
    33
  further issues (\secref{sec:further}) we conclude with an overview
haftmann
parents: 28428
diff changeset
    34
  of some ML programming interfaces (\secref{sec:ml}).
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    35
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    36
  \begin{warn}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    37
    Ultimately, the code generator which this tutorial deals with
28447
haftmann
parents: 28428
diff changeset
    38
    is supposed to replace the existing code generator
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    39
    by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    40
    So, for the moment, there are two distinct code generators
28447
haftmann
parents: 28428
diff changeset
    41
    in Isabelle.  In case of ambiguity, we will refer to the framework
haftmann
parents: 28428
diff changeset
    42
    described here as @{text "generic code generator"}, to the
haftmann
parents: 28428
diff changeset
    43
    other as @{text "SML code generator"}.
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    44
    Also note that while the framework itself is
28428
haftmann
parents: 28420
diff changeset
    45
    object-logic independent, only @{theory HOL} provides a reasonable
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    46
    framework setup.    
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    47
  \end{warn}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    48
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
    49
*}
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
    50
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    51
subsection {* Code generation via shallow embedding \label{sec:intro} *}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    52
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    53
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    54
  The key concept for understanding @{text Isabelle}'s code generation is
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    55
  \emph{shallow embedding}, i.e.~logical entities like constants, types and
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    56
  classes are identified with corresponding concepts in the target language.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    57
28428
haftmann
parents: 28420
diff changeset
    58
  Inside @{theory HOL}, the @{command datatype} and
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    59
  @{command definition}/@{command primrec}/@{command fun} declarations form
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    60
  the core of a functional programming language.  The default code generator setup
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    61
  allows to turn those into functional programs immediately.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    62
  This means that \qt{naive} code generation can proceed without further ado.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    63
  For example, here a simple \qt{implementation} of amortised queues:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    64
*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    65
29798
6df726203e39 proper datatype abstraction example
haftmann
parents: 29560
diff changeset
    66
datatype %quote 'a queue = AQueue "'a list" "'a list"
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    67
28564
haftmann
parents: 28447
diff changeset
    68
definition %quote empty :: "'a queue" where
29798
6df726203e39 proper datatype abstraction example
haftmann
parents: 29560
diff changeset
    69
  "empty = AQueue [] []"
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    70
28564
haftmann
parents: 28447
diff changeset
    71
primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where
29798
6df726203e39 proper datatype abstraction example
haftmann
parents: 29560
diff changeset
    72
  "enqueue x (AQueue xs ys) = AQueue (x # xs) ys"
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    73
28564
haftmann
parents: 28447
diff changeset
    74
fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
29798
6df726203e39 proper datatype abstraction example
haftmann
parents: 29560
diff changeset
    75
    "dequeue (AQueue [] []) = (None, AQueue [] [])"
6df726203e39 proper datatype abstraction example
haftmann
parents: 29560
diff changeset
    76
  | "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)"
6df726203e39 proper datatype abstraction example
haftmann
parents: 29560
diff changeset
    77
  | "dequeue (AQueue xs []) =
6df726203e39 proper datatype abstraction example
haftmann
parents: 29560
diff changeset
    78
      (case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))"
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    79
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    80
text {* \noindent Then we can generate code e.g.~for @{text SML} as follows: *}
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
    81
28564
haftmann
parents: 28447
diff changeset
    82
export_code %quote empty dequeue enqueue in SML
28447
haftmann
parents: 28428
diff changeset
    83
  module_name Example file "examples/example.ML"
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    84
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    85
text {* \noindent resulting in the following code: *}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    86
28564
haftmann
parents: 28447
diff changeset
    87
text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    88
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    89
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    90
  \noindent The @{command export_code} command takes a space-separated list of
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    91
  constants for which code shall be generated;  anything else needed for those
28447
haftmann
parents: 28428
diff changeset
    92
  is added implicitly.  Then follows a target language identifier
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    93
  (@{text SML}, @{text OCaml} or @{text Haskell}) and a freely chosen module name.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    94
  A file name denotes the destination to store the generated code.  Note that
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    95
  the semantics of the destination depends on the target language:  for
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    96
  @{text SML} and @{text OCaml} it denotes a \emph{file}, for @{text Haskell}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    97
  it denotes a \emph{directory} where a file named as the module name
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    98
  (with extension @{text ".hs"}) is written:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
    99
*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   100
28564
haftmann
parents: 28447
diff changeset
   101
export_code %quote empty dequeue enqueue in Haskell
28447
haftmann
parents: 28428
diff changeset
   102
  module_name Example file "examples/"
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   103
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   104
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   105
  \noindent This is how the corresponding code in @{text Haskell} looks like:
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   106
*}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   107
28564
haftmann
parents: 28447
diff changeset
   108
text %quote {*@{code_stmts empty enqueue dequeue (Haskell)}*}
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   109
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   110
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   111
  \noindent This demonstrates the basic usage of the @{command export_code} command;
28447
haftmann
parents: 28428
diff changeset
   112
  for more details see \secref{sec:further}.
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   113
*}
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   114
28447
haftmann
parents: 28428
diff changeset
   115
subsection {* Code generator architecture \label{sec:concept} *}
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   116
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   117
text {*
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   118
  What you have seen so far should be already enough in a lot of cases.  If you
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   119
  are content with this, you can quit reading here.  Anyway, in order to customise
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   120
  and adapt the code generator, it is inevitable to gain some understanding
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   121
  how it works.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   122
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   123
  \begin{figure}[h]
30836
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   124
    \includegraphics{Thy/pictures/architecture}
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   125
    \caption{Code generator architecture}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   126
    \label{fig:arch}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   127
  \end{figure}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   128
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   129
  The code generator employs a notion of executability
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   130
  for three foundational executable ingredients known
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   131
  from functional programming:
29560
fa6c5d62adf5 "code equation" replaces "defining equation"
haftmann
parents: 28635
diff changeset
   132
  \emph{code equations}, \emph{datatypes}, and
fa6c5d62adf5 "code equation" replaces "defining equation"
haftmann
parents: 28635
diff changeset
   133
  \emph{type classes}.  A code equation as a first approximation
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   134
  is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   135
  (an equation headed by a constant @{text f} with arguments
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   136
  @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}).
29560
fa6c5d62adf5 "code equation" replaces "defining equation"
haftmann
parents: 28635
diff changeset
   137
  Code generation aims to turn code equations
28447
haftmann
parents: 28428
diff changeset
   138
  into a functional program.  This is achieved by three major
haftmann
parents: 28428
diff changeset
   139
  components which operate sequentially, i.e. the result of one is
haftmann
parents: 28428
diff changeset
   140
  the input
haftmann
parents: 28428
diff changeset
   141
  of the next in the chain,  see diagram \ref{fig:arch}:
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   142
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   143
  \begin{itemize}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   144
30836
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   145
    \item Starting point is a collection of raw code equations in a
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   146
      theory; due to proof irrelevance it is not relevant where they
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   147
      stem from but typically they are either descendant of specification
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   148
      tools or explicit proofs by the user.
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   149
      
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   150
    \item Before these raw code equations are continued
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   151
      with, they can be subjected to theorem transformations.  This
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   152
      \qn{preprocessor} is an interface which allows to apply the full
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   153
      expressiveness of ML-based theorem transformations to code
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   154
      generation.  The result of the preprocessing step is a
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   155
      structured collection of code equations.
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   156
30836
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   157
    \item These code equations are \qn{translated} to a program in an
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   158
      abstract intermediate language.  Think of it as a kind
28447
haftmann
parents: 28428
diff changeset
   159
      of \qt{Mini-Haskell} with four \qn{statements}: @{text data}
29560
fa6c5d62adf5 "code equation" replaces "defining equation"
haftmann
parents: 28635
diff changeset
   160
      (for datatypes), @{text fun} (stemming from code equations),
28447
haftmann
parents: 28428
diff changeset
   161
      also @{text class} and @{text inst} (for type classes).
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   162
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   163
    \item Finally, the abstract program is \qn{serialised} into concrete
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   164
      source code of a target language.
30836
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   165
      This step only produces concrete syntax but does not change the
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   166
      program in essence; all conceptual transformations occur in the
1344132160bb proper external tikz pictures
haftmann
parents: 30227
diff changeset
   167
      translation step.
28419
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   168
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   169
  \end{itemize}
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   170
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   171
  \noindent From these steps, only the two last are carried out outside the logic;  by
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   172
  keeping this layer as thin as possible, the amount of code to trust is
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   173
  kept to a minimum.
f65e8b318581 re-canibalised manual
haftmann
parents: 28213
diff changeset
   174
*}
28213
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   175
b52f9205a02d New outline for codegen tutorial -- draft
haftmann
parents:
diff changeset
   176
end