doc-src/IsarAdvanced/Codegen/Thy/document/Introduction.tex
changeset 28447 df77ed974a78
child 28564 1358b1ddd915
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/doc-src/IsarAdvanced/Codegen/Thy/document/Introduction.tex	Wed Oct 01 13:33:54 2008 +0200
     1.3 @@ -0,0 +1,369 @@
     1.4 +%
     1.5 +\begin{isabellebody}%
     1.6 +\def\isabellecontext{Introduction}%
     1.7 +%
     1.8 +\isadelimtheory
     1.9 +%
    1.10 +\endisadelimtheory
    1.11 +%
    1.12 +\isatagtheory
    1.13 +\isacommand{theory}\isamarkupfalse%
    1.14 +\ Introduction\isanewline
    1.15 +\isakeyword{imports}\ Setup\isanewline
    1.16 +\isakeyword{begin}%
    1.17 +\endisatagtheory
    1.18 +{\isafoldtheory}%
    1.19 +%
    1.20 +\isadelimtheory
    1.21 +%
    1.22 +\endisadelimtheory
    1.23 +%
    1.24 +\isamarkupchapter{Code generation from \isa{Isabelle{\isacharslash}HOL} theories%
    1.25 +}
    1.26 +\isamarkuptrue%
    1.27 +%
    1.28 +\isamarkupsection{Introduction and Overview%
    1.29 +}
    1.30 +\isamarkuptrue%
    1.31 +%
    1.32 +\begin{isamarkuptext}%
    1.33 +This tutorial introduces a generic code generator for the
    1.34 +  \isa{Isabelle} system.
    1.35 +  Generic in the sense that the
    1.36 +  \qn{target language} for which code shall ultimately be
    1.37 +  generated is not fixed but may be an arbitrary state-of-the-art
    1.38 +  functional programming language (currently, the implementation
    1.39 +  supports \isa{SML} \cite{SML}, \isa{OCaml} \cite{OCaml} and \isa{Haskell}
    1.40 +  \cite{haskell-revised-report}).
    1.41 +
    1.42 +  Conceptually the code generator framework is part
    1.43 +  of Isabelle's \hyperlink{theory.Pure}{\mbox{\isa{Pure}}} meta logic framework; the logic
    1.44 +  \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} which is an extension of \hyperlink{theory.Pure}{\mbox{\isa{Pure}}}
    1.45 +  already comes with a reasonable framework setup and thus provides
    1.46 +  a good working horse for raising code-generation-driven
    1.47 +  applications.  So, we assume some familiarity and experience
    1.48 +  with the ingredients of the \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} distribution theories.
    1.49 +  (see also \cite{isa-tutorial}).
    1.50 +
    1.51 +  The code generator aims to be usable with no further ado
    1.52 +  in most cases while allowing for detailed customisation.
    1.53 +  This manifests in the structure of this tutorial: after a short
    1.54 +  conceptual introduction with an example (\secref{sec:intro}),
    1.55 +  we discuss the generic customisation facilities (\secref{sec:program}).
    1.56 +  A further section (\secref{sec:adaption}) is dedicated to the matter of
    1.57 +  \qn{adaption} to specific target language environments.  After some
    1.58 +  further issues (\secref{sec:further}) we conclude with an overview
    1.59 +  of some ML programming interfaces (\secref{sec:ml}).
    1.60 +
    1.61 +  \begin{warn}
    1.62 +    Ultimately, the code generator which this tutorial deals with
    1.63 +    is supposed to replace the existing code generator
    1.64 +    by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
    1.65 +    So, for the moment, there are two distinct code generators
    1.66 +    in Isabelle.  In case of ambiguity, we will refer to the framework
    1.67 +    described here as \isa{generic\ code\ generator}, to the
    1.68 +    other as \isa{SML\ code\ generator}.
    1.69 +    Also note that while the framework itself is
    1.70 +    object-logic independent, only \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} provides a reasonable
    1.71 +    framework setup.    
    1.72 +  \end{warn}%
    1.73 +\end{isamarkuptext}%
    1.74 +\isamarkuptrue%
    1.75 +%
    1.76 +\isamarkupsubsection{Code generation via shallow embedding \label{sec:intro}%
    1.77 +}
    1.78 +\isamarkuptrue%
    1.79 +%
    1.80 +\begin{isamarkuptext}%
    1.81 +The key concept for understanding \isa{Isabelle}'s code generation is
    1.82 +  \emph{shallow embedding}, i.e.~logical entities like constants, types and
    1.83 +  classes are identified with corresponding concepts in the target language.
    1.84 +
    1.85 +  Inside \hyperlink{theory.HOL}{\mbox{\isa{HOL}}}, the \hyperlink{command.datatype}{\mbox{\isa{\isacommand{datatype}}}} and
    1.86 +  \hyperlink{command.definition}{\mbox{\isa{\isacommand{definition}}}}/\hyperlink{command.primrec}{\mbox{\isa{\isacommand{primrec}}}}/\hyperlink{command.fun}{\mbox{\isa{\isacommand{fun}}}} declarations form
    1.87 +  the core of a functional programming language.  The default code generator setup
    1.88 +  allows to turn those into functional programs immediately.
    1.89 +  This means that \qt{naive} code generation can proceed without further ado.
    1.90 +  For example, here a simple \qt{implementation} of amortised queues:%
    1.91 +\end{isamarkuptext}%
    1.92 +\isamarkuptrue%
    1.93 +%
    1.94 +\isadelimquoteme
    1.95 +%
    1.96 +\endisadelimquoteme
    1.97 +%
    1.98 +\isatagquoteme
    1.99 +\isacommand{datatype}\isamarkupfalse%
   1.100 +\ {\isacharprime}a\ queue\ {\isacharequal}\ Queue\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\isanewline
   1.101 +\isanewline
   1.102 +\isacommand{definition}\isamarkupfalse%
   1.103 +\ empty\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
   1.104 +\ \ {\isachardoublequoteopen}empty\ {\isacharequal}\ Queue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isachardoublequoteclose}\isanewline
   1.105 +\isanewline
   1.106 +\isacommand{primrec}\isamarkupfalse%
   1.107 +\ enqueue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ {\isasymRightarrow}\ {\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
   1.108 +\ \ {\isachardoublequoteopen}enqueue\ x\ {\isacharparenleft}Queue\ xs\ ys{\isacharparenright}\ {\isacharequal}\ Queue\ {\isacharparenleft}x\ {\isacharhash}\ xs{\isacharparenright}\ ys{\isachardoublequoteclose}\isanewline
   1.109 +\isanewline
   1.110 +\isacommand{fun}\isamarkupfalse%
   1.111 +\ dequeue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ option\ {\isasymtimes}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
   1.112 +\ \ \ \ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}Queue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}None{\isacharcomma}\ Queue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}{\isachardoublequoteclose}\isanewline
   1.113 +\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}Queue\ xs\ {\isacharparenleft}y\ {\isacharhash}\ ys{\isacharparenright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}Some\ y{\isacharcomma}\ Queue\ xs\ ys{\isacharparenright}{\isachardoublequoteclose}\isanewline
   1.114 +\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}Queue\ xs\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\isanewline
   1.115 +\ \ \ \ \ \ {\isacharparenleft}case\ rev\ xs\ of\ y\ {\isacharhash}\ ys\ {\isasymRightarrow}\ {\isacharparenleft}Some\ y{\isacharcomma}\ Queue\ {\isacharbrackleft}{\isacharbrackright}\ ys{\isacharparenright}{\isacharparenright}{\isachardoublequoteclose}%
   1.116 +\endisatagquoteme
   1.117 +{\isafoldquoteme}%
   1.118 +%
   1.119 +\isadelimquoteme
   1.120 +%
   1.121 +\endisadelimquoteme
   1.122 +%
   1.123 +\begin{isamarkuptext}%
   1.124 +\noindent Then we can generate code e.g.~for \isa{SML} as follows:%
   1.125 +\end{isamarkuptext}%
   1.126 +\isamarkuptrue%
   1.127 +%
   1.128 +\isadelimquoteme
   1.129 +%
   1.130 +\endisadelimquoteme
   1.131 +%
   1.132 +\isatagquoteme
   1.133 +\isacommand{export{\isacharunderscore}code}\isamarkupfalse%
   1.134 +\ empty\ dequeue\ enqueue\ \isakeyword{in}\ SML\isanewline
   1.135 +\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}example{\isachardot}ML{\isachardoublequoteclose}%
   1.136 +\endisatagquoteme
   1.137 +{\isafoldquoteme}%
   1.138 +%
   1.139 +\isadelimquoteme
   1.140 +%
   1.141 +\endisadelimquoteme
   1.142 +%
   1.143 +\begin{isamarkuptext}%
   1.144 +\noindent resulting in the following code:%
   1.145 +\end{isamarkuptext}%
   1.146 +\isamarkuptrue%
   1.147 +%
   1.148 +\isadelimquoteme
   1.149 +%
   1.150 +\endisadelimquoteme
   1.151 +%
   1.152 +\isatagquoteme
   1.153 +%
   1.154 +\begin{isamarkuptext}%
   1.155 +\isaverbatim%
   1.156 +\noindent%
   1.157 +\verb|structure Example = |\newline%
   1.158 +\verb|struct|\newline%
   1.159 +\newline%
   1.160 +\verb|fun foldl f a [] = a|\newline%
   1.161 +\verb|  |\verb,|,\verb| foldl f a (x :: xs) = foldl f (f a x) xs;|\newline%
   1.162 +\newline%
   1.163 +\verb|fun rev xs = foldl (fn xsa => fn x => x :: xsa) [] xs;|\newline%
   1.164 +\newline%
   1.165 +\verb|fun list_case f1 f2 (a :: lista) = f2 a lista|\newline%
   1.166 +\verb|  |\verb,|,\verb| list_case f1 f2 [] = f1;|\newline%
   1.167 +\newline%
   1.168 +\verb|datatype 'a queue = Queue of 'a list * 'a list;|\newline%
   1.169 +\newline%
   1.170 +\verb|val empty : 'a queue = Queue ([], []);|\newline%
   1.171 +\newline%
   1.172 +\verb|fun dequeue (Queue ([], [])) = (NONE, Queue ([], []))|\newline%
   1.173 +\verb|  |\verb,|,\verb| dequeue (Queue (xs, y :: ys)) = (SOME y, Queue (xs, ys))|\newline%
   1.174 +\verb|  |\verb,|,\verb| dequeue (Queue (v :: va, [])) =|\newline%
   1.175 +\verb|    let|\newline%
   1.176 +\verb|      val y :: ys = rev (v :: va);|\newline%
   1.177 +\verb|    in|\newline%
   1.178 +\verb|      (SOME y, Queue ([], ys))|\newline%
   1.179 +\verb|    end;|\newline%
   1.180 +\newline%
   1.181 +\verb|fun enqueue x (Queue (xs, ys)) = Queue (x :: xs, ys);|\newline%
   1.182 +\newline%
   1.183 +\verb|end; (*struct Example*)|%
   1.184 +\end{isamarkuptext}%
   1.185 +\isamarkuptrue%
   1.186 +%
   1.187 +\endisatagquoteme
   1.188 +{\isafoldquoteme}%
   1.189 +%
   1.190 +\isadelimquoteme
   1.191 +%
   1.192 +\endisadelimquoteme
   1.193 +%
   1.194 +\begin{isamarkuptext}%
   1.195 +\noindent The \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command takes a space-separated list of
   1.196 +  constants for which code shall be generated;  anything else needed for those
   1.197 +  is added implicitly.  Then follows a target language identifier
   1.198 +  (\isa{SML}, \isa{OCaml} or \isa{Haskell}) and a freely chosen module name.
   1.199 +  A file name denotes the destination to store the generated code.  Note that
   1.200 +  the semantics of the destination depends on the target language:  for
   1.201 +  \isa{SML} and \isa{OCaml} it denotes a \emph{file}, for \isa{Haskell}
   1.202 +  it denotes a \emph{directory} where a file named as the module name
   1.203 +  (with extension \isa{{\isachardot}hs}) is written:%
   1.204 +\end{isamarkuptext}%
   1.205 +\isamarkuptrue%
   1.206 +%
   1.207 +\isadelimquoteme
   1.208 +%
   1.209 +\endisadelimquoteme
   1.210 +%
   1.211 +\isatagquoteme
   1.212 +\isacommand{export{\isacharunderscore}code}\isamarkupfalse%
   1.213 +\ empty\ dequeue\ enqueue\ \isakeyword{in}\ Haskell\isanewline
   1.214 +\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}{\isachardoublequoteclose}%
   1.215 +\endisatagquoteme
   1.216 +{\isafoldquoteme}%
   1.217 +%
   1.218 +\isadelimquoteme
   1.219 +%
   1.220 +\endisadelimquoteme
   1.221 +%
   1.222 +\begin{isamarkuptext}%
   1.223 +\noindent This is how the corresponding code in \isa{Haskell} looks like:%
   1.224 +\end{isamarkuptext}%
   1.225 +\isamarkuptrue%
   1.226 +%
   1.227 +\isadelimquoteme
   1.228 +%
   1.229 +\endisadelimquoteme
   1.230 +%
   1.231 +\isatagquoteme
   1.232 +%
   1.233 +\begin{isamarkuptext}%
   1.234 +\isaverbatim%
   1.235 +\noindent%
   1.236 +\verb|module Example where {|\newline%
   1.237 +\newline%
   1.238 +\newline%
   1.239 +\verb|foldla :: forall a b. (a -> b -> a) -> a -> [b] -> a;|\newline%
   1.240 +\verb|foldla f a [] = a;|\newline%
   1.241 +\verb|foldla f a (x : xs) = foldla f (f a x) xs;|\newline%
   1.242 +\newline%
   1.243 +\verb|rev :: forall a. [a] -> [a];|\newline%
   1.244 +\verb|rev xs = foldla (\ xsa x -> x : xsa) [] xs;|\newline%
   1.245 +\newline%
   1.246 +\verb|list_case :: forall t a. t -> (a -> [a] -> t) -> [a] -> t;|\newline%
   1.247 +\verb|list_case f1 f2 (a : list) = f2 a list;|\newline%
   1.248 +\verb|list_case f1 f2 [] = f1;|\newline%
   1.249 +\newline%
   1.250 +\verb|data Queue a = Queue [a] [a];|\newline%
   1.251 +\newline%
   1.252 +\verb|empty :: forall a. Queue a;|\newline%
   1.253 +\verb|empty = Queue [] [];|\newline%
   1.254 +\newline%
   1.255 +\verb|dequeue :: forall a. Queue a -> (Maybe a, Queue a);|\newline%
   1.256 +\verb|dequeue (Queue [] []) = (Nothing, Queue [] []);|\newline%
   1.257 +\verb|dequeue (Queue xs (y : ys)) = (Just y, Queue xs ys);|\newline%
   1.258 +\verb|dequeue (Queue (v : va) []) =|\newline%
   1.259 +\verb|  let {|\newline%
   1.260 +\verb|    (y : ys) = rev (v : va);|\newline%
   1.261 +\verb|  } in (Just y, Queue [] ys);|\newline%
   1.262 +\newline%
   1.263 +\verb|enqueue :: forall a. a -> Queue a -> Queue a;|\newline%
   1.264 +\verb|enqueue x (Queue xs ys) = Queue (x : xs) ys;|\newline%
   1.265 +\newline%
   1.266 +\verb|}|%
   1.267 +\end{isamarkuptext}%
   1.268 +\isamarkuptrue%
   1.269 +%
   1.270 +\endisatagquoteme
   1.271 +{\isafoldquoteme}%
   1.272 +%
   1.273 +\isadelimquoteme
   1.274 +%
   1.275 +\endisadelimquoteme
   1.276 +%
   1.277 +\begin{isamarkuptext}%
   1.278 +\noindent This demonstrates the basic usage of the \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command;
   1.279 +  for more details see \secref{sec:further}.%
   1.280 +\end{isamarkuptext}%
   1.281 +\isamarkuptrue%
   1.282 +%
   1.283 +\isamarkupsubsection{Code generator architecture \label{sec:concept}%
   1.284 +}
   1.285 +\isamarkuptrue%
   1.286 +%
   1.287 +\begin{isamarkuptext}%
   1.288 +What you have seen so far should be already enough in a lot of cases.  If you
   1.289 +  are content with this, you can quit reading here.  Anyway, in order to customise
   1.290 +  and adapt the code generator, it is inevitable to gain some understanding
   1.291 +  how it works.
   1.292 +
   1.293 +  \begin{figure}[h]
   1.294 +    \centering
   1.295 +    \includegraphics[width=0.7\textwidth]{codegen_process}
   1.296 +    \caption{Code generator architecture}
   1.297 +    \label{fig:arch}
   1.298 +  \end{figure}
   1.299 +
   1.300 +  The code generator employs a notion of executability
   1.301 +  for three foundational executable ingredients known
   1.302 +  from functional programming:
   1.303 +  \emph{defining equations}, \emph{datatypes}, and
   1.304 +  \emph{type classes}.  A defining equation as a first approximation
   1.305 +  is a theorem of the form \isa{f\ t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n\ {\isasymequiv}\ t}
   1.306 +  (an equation headed by a constant \isa{f} with arguments
   1.307 +  \isa{t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n} and right hand side \isa{t}).
   1.308 +  Code generation aims to turn defining equations
   1.309 +  into a functional program.  This is achieved by three major
   1.310 +  components which operate sequentially, i.e. the result of one is
   1.311 +  the input
   1.312 +  of the next in the chain,  see diagram \ref{fig:arch}:
   1.313 +
   1.314 +  \begin{itemize}
   1.315 +
   1.316 +    \item Out of the vast collection of theorems proven in a
   1.317 +      \qn{theory}, a reasonable subset modelling
   1.318 +      defining equations is \qn{selected}.
   1.319 +
   1.320 +    \item On those selected theorems, certain
   1.321 +      transformations are carried out
   1.322 +      (\qn{preprocessing}).  Their purpose is to turn theorems
   1.323 +      representing non- or badly executable
   1.324 +      specifications into equivalent but executable counterparts.
   1.325 +      The result is a structured collection of \qn{code theorems}.
   1.326 +
   1.327 +    \item Before the selected defining equations are continued with,
   1.328 +      they can be \qn{preprocessed}, i.e. subjected to theorem
   1.329 +      transformations.  This \qn{preprocessor} is an interface which
   1.330 +      allows to apply
   1.331 +      the full expressiveness of ML-based theorem transformations
   1.332 +      to code generation;  motivating examples are shown below, see
   1.333 +      \secref{sec:preproc}.
   1.334 +      The result of the preprocessing step is a structured collection
   1.335 +      of defining equations.
   1.336 +
   1.337 +    \item These defining equations are \qn{translated} to a program
   1.338 +      in an abstract intermediate language.  Think of it as a kind
   1.339 +      of \qt{Mini-Haskell} with four \qn{statements}: \isa{data}
   1.340 +      (for datatypes), \isa{fun} (stemming from defining equations),
   1.341 +      also \isa{class} and \isa{inst} (for type classes).
   1.342 +
   1.343 +    \item Finally, the abstract program is \qn{serialised} into concrete
   1.344 +      source code of a target language.
   1.345 +
   1.346 +  \end{itemize}
   1.347 +
   1.348 +  \noindent From these steps, only the two last are carried out outside the logic;  by
   1.349 +  keeping this layer as thin as possible, the amount of code to trust is
   1.350 +  kept to a minimum.%
   1.351 +\end{isamarkuptext}%
   1.352 +\isamarkuptrue%
   1.353 +%
   1.354 +\isadelimtheory
   1.355 +%
   1.356 +\endisadelimtheory
   1.357 +%
   1.358 +\isatagtheory
   1.359 +\isacommand{end}\isamarkupfalse%
   1.360 +%
   1.361 +\endisatagtheory
   1.362 +{\isafoldtheory}%
   1.363 +%
   1.364 +\isadelimtheory
   1.365 +%
   1.366 +\endisadelimtheory
   1.367 +\isanewline
   1.368 +\end{isabellebody}%
   1.369 +%%% Local Variables:
   1.370 +%%% mode: latex
   1.371 +%%% TeX-master: "root"
   1.372 +%%% End: