| author | bulwahn |
| Wed, 21 Jul 2010 18:11:51 +0200 | |
| changeset 37920 | 581c1e5f53e0 |
| parent 34155 | 14aaccb399b3 |
| child 38402 | 58fc3a3af71f |
| permissions | -rw-r--r-- |
| 28213 | 1 |
theory Introduction |
2 |
imports Setup |
|
3 |
begin |
|
4 |
||
5 |
section {* Introduction and Overview *}
|
|
6 |
||
7 |
text {*
|
|
8 |
This tutorial introduces a generic code generator for the |
|
| 28419 | 9 |
@{text Isabelle} system.
|
| 34155 | 10 |
The |
11 |
\qn{target language} for which code is
|
|
12 |
generated is not fixed, but may be one of several |
|
13 |
functional programming languages (currently, the implementation |
|
| 28419 | 14 |
supports @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml} and @{text Haskell}
|
15 |
\cite{haskell-revised-report}).
|
|
16 |
||
17 |
Conceptually the code generator framework is part |
|
| 28428 | 18 |
of Isabelle's @{theory Pure} meta logic framework; the logic
|
| 34155 | 19 |
@{theory HOL} \cite{isa-tutorial}, which is an extension of @{theory Pure},
|
| 28419 | 20 |
already comes with a reasonable framework setup and thus provides |
| 34155 | 21 |
a good basis for creating code-generation-driven |
| 28419 | 22 |
applications. So, we assume some familiarity and experience |
| 28428 | 23 |
with the ingredients of the @{theory HOL} distribution theories.
|
| 28419 | 24 |
|
25 |
The code generator aims to be usable with no further ado |
|
| 34155 | 26 |
in most cases, while allowing for detailed customisation. |
27 |
This can be seen in the structure of this tutorial: after a short |
|
| 28447 | 28 |
conceptual introduction with an example (\secref{sec:intro}),
|
29 |
we discuss the generic customisation facilities (\secref{sec:program}).
|
|
| 31050 | 30 |
A further section (\secref{sec:adaptation}) is dedicated to the matter of
|
31 |
\qn{adaptation} to specific target language environments. After some
|
|
| 28447 | 32 |
further issues (\secref{sec:further}) we conclude with an overview
|
33 |
of some ML programming interfaces (\secref{sec:ml}).
|
|
| 28419 | 34 |
|
35 |
\begin{warn}
|
|
36 |
Ultimately, the code generator which this tutorial deals with |
|
| 28447 | 37 |
is supposed to replace the existing code generator |
| 28419 | 38 |
by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
|
39 |
So, for the moment, there are two distinct code generators |
|
| 28447 | 40 |
in Isabelle. In case of ambiguity, we will refer to the framework |
41 |
described here as @{text "generic code generator"}, to the
|
|
42 |
other as @{text "SML code generator"}.
|
|
| 28419 | 43 |
Also note that while the framework itself is |
| 28428 | 44 |
object-logic independent, only @{theory HOL} provides a reasonable
|
| 28419 | 45 |
framework setup. |
46 |
\end{warn}
|
|
47 |
||
| 28213 | 48 |
*} |
49 |
||
| 28419 | 50 |
subsection {* Code generation via shallow embedding \label{sec:intro} *}
|
51 |
||
52 |
text {*
|
|
53 |
The key concept for understanding @{text Isabelle}'s code generation is
|
|
54 |
\emph{shallow embedding}, i.e.~logical entities like constants, types and
|
|
55 |
classes are identified with corresponding concepts in the target language. |
|
56 |
||
| 28428 | 57 |
Inside @{theory HOL}, the @{command datatype} and
|
| 28419 | 58 |
@{command definition}/@{command primrec}/@{command fun} declarations form
|
59 |
the core of a functional programming language. The default code generator setup |
|
| 34155 | 60 |
transforms those into functional programs immediately. |
| 28419 | 61 |
This means that \qt{naive} code generation can proceed without further ado.
|
62 |
For example, here a simple \qt{implementation} of amortised queues:
|
|
63 |
*} |
|
64 |
||
| 29798 | 65 |
datatype %quote 'a queue = AQueue "'a list" "'a list" |
| 28419 | 66 |
|
| 28564 | 67 |
definition %quote empty :: "'a queue" where |
| 29798 | 68 |
"empty = AQueue [] []" |
| 28419 | 69 |
|
| 28564 | 70 |
primrec %quote enqueue :: "'a \<Rightarrow> 'a queue \<Rightarrow> 'a queue" where |
| 29798 | 71 |
"enqueue x (AQueue xs ys) = AQueue (x # xs) ys" |
| 28419 | 72 |
|
| 28564 | 73 |
fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where |
| 29798 | 74 |
"dequeue (AQueue [] []) = (None, AQueue [] [])" |
75 |
| "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)" |
|
76 |
| "dequeue (AQueue xs []) = |
|
77 |
(case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))" |
|
| 28419 | 78 |
|
79 |
text {* \noindent Then we can generate code e.g.~for @{text SML} as follows: *}
|
|
| 28213 | 80 |
|
| 28564 | 81 |
export_code %quote empty dequeue enqueue in SML |
| 28447 | 82 |
module_name Example file "examples/example.ML" |
| 28419 | 83 |
|
84 |
text {* \noindent resulting in the following code: *}
|
|
85 |
||
| 28564 | 86 |
text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
|
| 28419 | 87 |
|
88 |
text {*
|
|
89 |
\noindent The @{command export_code} command takes a space-separated list of
|
|
90 |
constants for which code shall be generated; anything else needed for those |
|
| 28447 | 91 |
is added implicitly. Then follows a target language identifier |
| 28419 | 92 |
(@{text SML}, @{text OCaml} or @{text Haskell}) and a freely chosen module name.
|
93 |
A file name denotes the destination to store the generated code. Note that |
|
94 |
the semantics of the destination depends on the target language: for |
|
95 |
@{text SML} and @{text OCaml} it denotes a \emph{file}, for @{text Haskell}
|
|
96 |
it denotes a \emph{directory} where a file named as the module name
|
|
97 |
(with extension @{text ".hs"}) is written:
|
|
98 |
*} |
|
99 |
||
| 28564 | 100 |
export_code %quote empty dequeue enqueue in Haskell |
| 28447 | 101 |
module_name Example file "examples/" |
| 28419 | 102 |
|
103 |
text {*
|
|
| 34155 | 104 |
\noindent This is the corresponding code in @{text Haskell}:
|
| 28419 | 105 |
*} |
106 |
||
| 28564 | 107 |
text %quote {*@{code_stmts empty enqueue dequeue (Haskell)}*}
|
| 28419 | 108 |
|
109 |
text {*
|
|
110 |
\noindent This demonstrates the basic usage of the @{command export_code} command;
|
|
| 28447 | 111 |
for more details see \secref{sec:further}.
|
| 28419 | 112 |
*} |
| 28213 | 113 |
|
| 28447 | 114 |
subsection {* Code generator architecture \label{sec:concept} *}
|
| 28213 | 115 |
|
| 28419 | 116 |
text {*
|
117 |
What you have seen so far should be already enough in a lot of cases. If you |
|
118 |
are content with this, you can quit reading here. Anyway, in order to customise |
|
| 34155 | 119 |
and adapt the code generator, it is necessary to gain some understanding |
| 28419 | 120 |
how it works. |
121 |
||
122 |
\begin{figure}[h]
|
|
|
30882
d15725e84091
moved generated eps/pdf to main directory, for proper display in dvi;
wenzelm
parents:
30880
diff
changeset
|
123 |
\includegraphics{architecture}
|
| 28419 | 124 |
\caption{Code generator architecture}
|
125 |
\label{fig:arch}
|
|
126 |
\end{figure}
|
|
127 |
||
128 |
The code generator employs a notion of executability |
|
129 |
for three foundational executable ingredients known |
|
130 |
from functional programming: |
|
| 29560 | 131 |
\emph{code equations}, \emph{datatypes}, and
|
132 |
\emph{type classes}. A code equation as a first approximation
|
|
| 28419 | 133 |
is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
|
134 |
(an equation headed by a constant @{text f} with arguments
|
|
135 |
@{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}).
|
|
| 29560 | 136 |
Code generation aims to turn code equations |
| 28447 | 137 |
into a functional program. This is achieved by three major |
138 |
components which operate sequentially, i.e. the result of one is |
|
139 |
the input |
|
| 30880 | 140 |
of the next in the chain, see figure \ref{fig:arch}:
|
| 28419 | 141 |
|
142 |
\begin{itemize}
|
|
143 |
||
| 34155 | 144 |
\item The starting point is a collection of raw code equations in a |
145 |
theory. It is not relevant where they |
|
146 |
stem from, but typically they were either produced by specification |
|
147 |
tools or proved explicitly by the user. |
|
| 30836 | 148 |
|
| 34155 | 149 |
\item These raw code equations can be subjected to theorem transformations. This |
150 |
\qn{preprocessor} can apply the full
|
|
| 30836 | 151 |
expressiveness of ML-based theorem transformations to code |
| 34155 | 152 |
generation. The result of preprocessing is a |
| 30836 | 153 |
structured collection of code equations. |
| 28419 | 154 |
|
| 30836 | 155 |
\item These code equations are \qn{translated} to a program in an
|
156 |
abstract intermediate language. Think of it as a kind |
|
| 28447 | 157 |
of \qt{Mini-Haskell} with four \qn{statements}: @{text data}
|
| 29560 | 158 |
(for datatypes), @{text fun} (stemming from code equations),
|
| 28447 | 159 |
also @{text class} and @{text inst} (for type classes).
|
| 28419 | 160 |
|
161 |
\item Finally, the abstract program is \qn{serialised} into concrete
|
|
162 |
source code of a target language. |
|
| 30836 | 163 |
This step only produces concrete syntax but does not change the |
164 |
program in essence; all conceptual transformations occur in the |
|
165 |
translation step. |
|
| 28419 | 166 |
|
167 |
\end{itemize}
|
|
168 |
||
169 |
\noindent From these steps, only the two last are carried out outside the logic; by |
|
170 |
keeping this layer as thin as possible, the amount of code to trust is |
|
171 |
kept to a minimum. |
|
172 |
*} |
|
| 28213 | 173 |
|
174 |
end |