| author | haftmann | 
| Fri, 13 Aug 2010 14:40:15 +0200 | |
| changeset 38405 | 7935b334893e | 
| parent 38402 | 58fc3a3af71f | 
| child 38437 | ffb1c5bf0425 | 
| permissions | -rw-r--r-- | 
| 28213 | 1 | theory Introduction | 
| 2 | imports Setup | |
| 3 | begin | |
| 4 | ||
| 38405 | 5 | section {* Introduction *}
 | 
| 6 | ||
| 7 | subsection {* Code generation fundamental: shallow embedding *}
 | |
| 8 | ||
| 9 | subsection {* A quick start with the @{text "Isabelle/HOL"} toolbox *}
 | |
| 10 | ||
| 11 | subsection {* Type classes *}
 | |
| 12 | ||
| 13 | subsection {* How to continue from here *}
 | |
| 14 | ||
| 15 | subsection {* If something goes utterly wrong *}
 | |
| 28213 | 16 | |
| 17 | text {*
 | |
| 18 | This tutorial introduces a generic code generator for the | |
| 28419 | 19 |   @{text Isabelle} system.
 | 
| 34155 | 20 | The | 
| 21 |   \qn{target language} for which code is
 | |
| 22 | generated is not fixed, but may be one of several | |
| 23 | functional programming languages (currently, the implementation | |
| 28419 | 24 |   supports @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml} and @{text Haskell}
 | 
| 25 |   \cite{haskell-revised-report}).
 | |
| 26 | ||
| 27 | Conceptually the code generator framework is part | |
| 28428 | 28 |   of Isabelle's @{theory Pure} meta logic framework; the logic
 | 
| 34155 | 29 |   @{theory HOL} \cite{isa-tutorial}, which is an extension of @{theory Pure},
 | 
| 28419 | 30 | already comes with a reasonable framework setup and thus provides | 
| 34155 | 31 | a good basis for creating code-generation-driven | 
| 28419 | 32 | applications. So, we assume some familiarity and experience | 
| 28428 | 33 |   with the ingredients of the @{theory HOL} distribution theories.
 | 
| 28419 | 34 | |
| 35 | The code generator aims to be usable with no further ado | |
| 34155 | 36 | in most cases, while allowing for detailed customisation. | 
| 37 | This can be seen in the structure of this tutorial: after a short | |
| 28447 | 38 |   conceptual introduction with an example (\secref{sec:intro}),
 | 
| 39 |   we discuss the generic customisation facilities (\secref{sec:program}).
 | |
| 31050 | 40 |   A further section (\secref{sec:adaptation}) is dedicated to the matter of
 | 
| 41 |   \qn{adaptation} to specific target language environments.  After some
 | |
| 28447 | 42 |   further issues (\secref{sec:further}) we conclude with an overview
 | 
| 43 |   of some ML programming interfaces (\secref{sec:ml}).
 | |
| 28419 | 44 | |
| 45 |   \begin{warn}
 | |
| 46 | Ultimately, the code generator which this tutorial deals with | |
| 28447 | 47 | is supposed to replace the existing code generator | 
| 28419 | 48 |     by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
 | 
| 49 | So, for the moment, there are two distinct code generators | |
| 28447 | 50 | in Isabelle. In case of ambiguity, we will refer to the framework | 
| 51 |     described here as @{text "generic code generator"}, to the
 | |
| 52 |     other as @{text "SML code generator"}.
 | |
| 28419 | 53 | Also note that while the framework itself is | 
| 28428 | 54 |     object-logic independent, only @{theory HOL} provides a reasonable
 | 
| 28419 | 55 | framework setup. | 
| 56 |   \end{warn}
 | |
| 57 | ||
| 28213 | 58 | *} | 
| 59 | ||
| 28419 | 60 | subsection {* Code generation via shallow embedding \label{sec:intro} *}
 | 
| 61 | ||
| 62 | text {*
 | |
| 63 |   The key concept for understanding @{text Isabelle}'s code generation is
 | |
| 64 |   \emph{shallow embedding}, i.e.~logical entities like constants, types and
 | |
| 65 | classes are identified with corresponding concepts in the target language. | |
| 66 | ||
| 28428 | 67 |   Inside @{theory HOL}, the @{command datatype} and
 | 
| 28419 | 68 |   @{command definition}/@{command primrec}/@{command fun} declarations form
 | 
| 69 | the core of a functional programming language. The default code generator setup | |
| 34155 | 70 | transforms those into functional programs immediately. | 
| 28419 | 71 |   This means that \qt{naive} code generation can proceed without further ado.
 | 
| 72 |   For example, here a simple \qt{implementation} of amortised queues:
 | |
| 73 | *} | |
| 74 | ||
| 29798 | 75 | datatype %quote 'a queue = AQueue "'a list" "'a list" | 
| 28419 | 76 | |
| 28564 | 77 | definition %quote empty :: "'a queue" where | 
| 29798 | 78 | "empty = AQueue [] []" | 
| 28419 | 79 | |
| 28564 | 80 | primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where | 
| 29798 | 81 | "enqueue x (AQueue xs ys) = AQueue (x # xs) ys" | 
| 28419 | 82 | |
| 28564 | 83 | fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where | 
| 29798 | 84 | "dequeue (AQueue [] []) = (None, AQueue [] [])" | 
| 85 | | "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)" | |
| 86 | | "dequeue (AQueue xs []) = | |
| 87 | (case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))" | |
| 28419 | 88 | |
| 89 | text {* \noindent Then we can generate code e.g.~for @{text SML} as follows: *}
 | |
| 28213 | 90 | |
| 28564 | 91 | export_code %quote empty dequeue enqueue in SML | 
| 28447 | 92 | module_name Example file "examples/example.ML" | 
| 28419 | 93 | |
| 94 | text {* \noindent resulting in the following code: *}
 | |
| 95 | ||
| 28564 | 96 | text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
 | 
| 28419 | 97 | |
| 98 | text {*
 | |
| 99 |   \noindent The @{command export_code} command takes a space-separated list of
 | |
| 100 | constants for which code shall be generated; anything else needed for those | |
| 28447 | 101 | is added implicitly. Then follows a target language identifier | 
| 28419 | 102 |   (@{text SML}, @{text OCaml} or @{text Haskell}) and a freely chosen module name.
 | 
| 103 | A file name denotes the destination to store the generated code. Note that | |
| 104 | the semantics of the destination depends on the target language: for | |
| 105 |   @{text SML} and @{text OCaml} it denotes a \emph{file}, for @{text Haskell}
 | |
| 106 |   it denotes a \emph{directory} where a file named as the module name
 | |
| 107 |   (with extension @{text ".hs"}) is written:
 | |
| 108 | *} | |
| 109 | ||
| 28564 | 110 | export_code %quote empty dequeue enqueue in Haskell | 
| 28447 | 111 | module_name Example file "examples/" | 
| 28419 | 112 | |
| 113 | text {*
 | |
| 34155 | 114 |   \noindent This is the corresponding code in @{text Haskell}:
 | 
| 28419 | 115 | *} | 
| 116 | ||
| 28564 | 117 | text %quote {*@{code_stmts empty enqueue dequeue (Haskell)}*}
 | 
| 28419 | 118 | |
| 119 | text {*
 | |
| 120 |   \noindent This demonstrates the basic usage of the @{command export_code} command;
 | |
| 28447 | 121 |   for more details see \secref{sec:further}.
 | 
| 28419 | 122 | *} | 
| 28213 | 123 | |
| 38402 | 124 | subsection {* If something utterly fails *}
 | 
| 125 | ||
| 126 | text {*
 | |
| 127 | Under certain circumstances, the code generator fails to produce | |
| 128 | code entirely. | |
| 129 | ||
| 130 |   \begin{description}
 | |
| 131 | ||
| 132 |     \ditem{generate only one module}
 | |
| 133 | ||
| 134 |     \ditem{check with a different target language}
 | |
| 135 | ||
| 136 |     \ditem{inspect code equations}
 | |
| 137 | ||
| 138 |     \ditem{inspect preprocessor setup}
 | |
| 139 | ||
| 140 |     \ditem{generate exceptions}
 | |
| 141 | ||
| 142 |     \ditem{remove offending code equations}
 | |
| 143 | ||
| 144 |   \end{description}
 | |
| 145 | *} | |
| 146 | ||
| 28447 | 147 | subsection {* Code generator architecture \label{sec:concept} *}
 | 
| 28213 | 148 | |
| 28419 | 149 | text {*
 | 
| 150 | What you have seen so far should be already enough in a lot of cases. If you | |
| 151 | are content with this, you can quit reading here. Anyway, in order to customise | |
| 34155 | 152 | and adapt the code generator, it is necessary to gain some understanding | 
| 28419 | 153 | how it works. | 
| 154 | ||
| 155 |   \begin{figure}[h]
 | |
| 30882 
d15725e84091
moved generated eps/pdf to main directory, for proper display in dvi;
 wenzelm parents: 
30880diff
changeset | 156 |     \includegraphics{architecture}
 | 
| 28419 | 157 |     \caption{Code generator architecture}
 | 
| 158 |     \label{fig:arch}
 | |
| 159 |   \end{figure}
 | |
| 160 | ||
| 161 | The code generator employs a notion of executability | |
| 162 | for three foundational executable ingredients known | |
| 163 | from functional programming: | |
| 29560 | 164 |   \emph{code equations}, \emph{datatypes}, and
 | 
| 165 |   \emph{type classes}.  A code equation as a first approximation
 | |
| 28419 | 166 |   is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
 | 
| 167 |   (an equation headed by a constant @{text f} with arguments
 | |
| 168 |   @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}).
 | |
| 29560 | 169 | Code generation aims to turn code equations | 
| 28447 | 170 | into a functional program. This is achieved by three major | 
| 171 | components which operate sequentially, i.e. the result of one is | |
| 172 | the input | |
| 30880 | 173 |   of the next in the chain,  see figure \ref{fig:arch}:
 | 
| 28419 | 174 | |
| 175 |   \begin{itemize}
 | |
| 176 | ||
| 34155 | 177 | \item The starting point is a collection of raw code equations in a | 
| 178 | theory. It is not relevant where they | |
| 179 | stem from, but typically they were either produced by specification | |
| 180 | tools or proved explicitly by the user. | |
| 30836 | 181 | |
| 34155 | 182 | \item These raw code equations can be subjected to theorem transformations. This | 
| 183 |       \qn{preprocessor} can apply the full
 | |
| 30836 | 184 | expressiveness of ML-based theorem transformations to code | 
| 34155 | 185 | generation. The result of preprocessing is a | 
| 30836 | 186 | structured collection of code equations. | 
| 28419 | 187 | |
| 30836 | 188 |     \item These code equations are \qn{translated} to a program in an
 | 
| 189 | abstract intermediate language. Think of it as a kind | |
| 28447 | 190 |       of \qt{Mini-Haskell} with four \qn{statements}: @{text data}
 | 
| 29560 | 191 |       (for datatypes), @{text fun} (stemming from code equations),
 | 
| 28447 | 192 |       also @{text class} and @{text inst} (for type classes).
 | 
| 28419 | 193 | |
| 194 |     \item Finally, the abstract program is \qn{serialised} into concrete
 | |
| 195 | source code of a target language. | |
| 30836 | 196 | This step only produces concrete syntax but does not change the | 
| 197 | program in essence; all conceptual transformations occur in the | |
| 198 | translation step. | |
| 28419 | 199 | |
| 200 |   \end{itemize}
 | |
| 201 | ||
| 202 | \noindent From these steps, only the two last are carried out outside the logic; by | |
| 203 | keeping this layer as thin as possible, the amount of code to trust is | |
| 204 | kept to a minimum. | |
| 205 | *} | |
| 28213 | 206 | |
| 207 | end |