diff -r 2775062fd3a9 -r 2f4684e2ea95 doc-src/Codegen/Thy/document/Introduction.tex --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/doc-src/Codegen/Thy/document/Introduction.tex Tue Mar 03 11:00:51 2009 +0100 @@ -0,0 +1,390 @@ +% +\begin{isabellebody}% +\def\isabellecontext{Introduction}% +% +\isadelimtheory +% +\endisadelimtheory +% +\isatagtheory +\isacommand{theory}\isamarkupfalse% +\ Introduction\isanewline +\isakeyword{imports}\ Setup\isanewline +\isakeyword{begin}% +\endisatagtheory +{\isafoldtheory}% +% +\isadelimtheory +% +\endisadelimtheory +% +\isamarkupchapter{Code generation from \isa{Isabelle{\isacharslash}HOL} theories% +} +\isamarkuptrue% +% +\isamarkupsection{Introduction and Overview% +} +\isamarkuptrue% +% +\begin{isamarkuptext}% +This tutorial introduces a generic code generator for the + \isa{Isabelle} system. + Generic in the sense that the + \qn{target language} for which code shall ultimately be + generated is not fixed but may be an arbitrary state-of-the-art + functional programming language (currently, the implementation + supports \isa{SML} \cite{SML}, \isa{OCaml} \cite{OCaml} and \isa{Haskell} + \cite{haskell-revised-report}). + + Conceptually the code generator framework is part + of Isabelle's \hyperlink{theory.Pure}{\mbox{\isa{Pure}}} meta logic framework; the logic + \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} which is an extension of \hyperlink{theory.Pure}{\mbox{\isa{Pure}}} + already comes with a reasonable framework setup and thus provides + a good working horse for raising code-generation-driven + applications. So, we assume some familiarity and experience + with the ingredients of the \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} distribution theories. + (see also \cite{isa-tutorial}). + + The code generator aims to be usable with no further ado + in most cases while allowing for detailed customisation. + This manifests in the structure of this tutorial: after a short + conceptual introduction with an example (\secref{sec:intro}), + we discuss the generic customisation facilities (\secref{sec:program}). + A further section (\secref{sec:adaption}) is dedicated to the matter of + \qn{adaption} to specific target language environments. After some + further issues (\secref{sec:further}) we conclude with an overview + of some ML programming interfaces (\secref{sec:ml}). + + \begin{warn} + Ultimately, the code generator which this tutorial deals with + is supposed to replace the existing code generator + by Stefan Berghofer \cite{Berghofer-Nipkow:2002}. + So, for the moment, there are two distinct code generators + in Isabelle. In case of ambiguity, we will refer to the framework + described here as \isa{generic\ code\ generator}, to the + other as \isa{SML\ code\ generator}. + Also note that while the framework itself is + object-logic independent, only \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} provides a reasonable + framework setup. + \end{warn}% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isamarkupsubsection{Code generation via shallow embedding \label{sec:intro}% +} +\isamarkuptrue% +% +\begin{isamarkuptext}% +The key concept for understanding \isa{Isabelle}'s code generation is + \emph{shallow embedding}, i.e.~logical entities like constants, types and + classes are identified with corresponding concepts in the target language. + + Inside \hyperlink{theory.HOL}{\mbox{\isa{HOL}}}, the \hyperlink{command.datatype}{\mbox{\isa{\isacommand{datatype}}}} and + \hyperlink{command.definition}{\mbox{\isa{\isacommand{definition}}}}/\hyperlink{command.primrec}{\mbox{\isa{\isacommand{primrec}}}}/\hyperlink{command.fun}{\mbox{\isa{\isacommand{fun}}}} declarations form + the core of a functional programming language. The default code generator setup + allows to turn those into functional programs immediately. + This means that \qt{naive} code generation can proceed without further ado. + For example, here a simple \qt{implementation} of amortised queues:% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isadelimquote +% +\endisadelimquote +% +\isatagquote +\isacommand{datatype}\isamarkupfalse% +\ {\isacharprime}a\ queue\ {\isacharequal}\ AQueue\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\isanewline +\isanewline +\isacommand{definition}\isamarkupfalse% +\ empty\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline +\ \ {\isachardoublequoteopen}empty\ {\isacharequal}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isachardoublequoteclose}\isanewline +\isanewline +\isacommand{primrec}\isamarkupfalse% +\ enqueue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ {\isasymRightarrow}\ {\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline +\ \ {\isachardoublequoteopen}enqueue\ x\ {\isacharparenleft}AQueue\ xs\ ys{\isacharparenright}\ {\isacharequal}\ AQueue\ {\isacharparenleft}x\ {\isacharhash}\ xs{\isacharparenright}\ ys{\isachardoublequoteclose}\isanewline +\isanewline +\isacommand{fun}\isamarkupfalse% +\ dequeue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ option\ {\isasymtimes}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline +\ \ \ \ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}None{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}{\isachardoublequoteclose}\isanewline +\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharparenleft}y\ {\isacharhash}\ ys{\isacharparenright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ xs\ ys{\isacharparenright}{\isachardoublequoteclose}\isanewline +\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\isanewline +\ \ \ \ \ \ {\isacharparenleft}case\ rev\ xs\ of\ y\ {\isacharhash}\ ys\ {\isasymRightarrow}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ ys{\isacharparenright}{\isacharparenright}{\isachardoublequoteclose}% +\endisatagquote +{\isafoldquote}% +% +\isadelimquote +% +\endisadelimquote +% +\begin{isamarkuptext}% +\noindent Then we can generate code e.g.~for \isa{SML} as follows:% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isadelimquote +% +\endisadelimquote +% +\isatagquote +\isacommand{export{\isacharunderscore}code}\isamarkupfalse% +\ empty\ dequeue\ enqueue\ \isakeyword{in}\ SML\isanewline +\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}example{\isachardot}ML{\isachardoublequoteclose}% +\endisatagquote +{\isafoldquote}% +% +\isadelimquote +% +\endisadelimquote +% +\begin{isamarkuptext}% +\noindent resulting in the following code:% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isadelimquote +% +\endisadelimquote +% +\isatagquote +% +\begin{isamarkuptext}% +\isatypewriter% +\noindent% +\hspace*{0pt}structure Example = \\ +\hspace*{0pt}struct\\ +\hspace*{0pt}\\ +\hspace*{0pt}fun foldl f a [] = a\\ +\hspace*{0pt} ~| foldl f a (x ::~xs) = foldl f (f a x) xs;\\ +\hspace*{0pt}\\ +\hspace*{0pt}fun rev xs = foldl (fn xsa => fn x => x ::~xsa) [] xs;\\ +\hspace*{0pt}\\ +\hspace*{0pt}fun list{\char95}case f1 f2 (a ::~lista) = f2 a lista\\ +\hspace*{0pt} ~| list{\char95}case f1 f2 [] = f1;\\ +\hspace*{0pt}\\ +\hspace*{0pt}datatype 'a queue = AQueue of 'a list * 'a list;\\ +\hspace*{0pt}\\ +\hspace*{0pt}val empty :~'a queue = AQueue ([],~[])\\ +\hspace*{0pt}\\ +\hspace*{0pt}fun dequeue (AQueue ([],~[])) = (NONE,~AQueue ([],~[]))\\ +\hspace*{0pt} ~| dequeue (AQueue (xs,~y ::~ys)) = (SOME y,~AQueue (xs,~ys))\\ +\hspace*{0pt} ~| dequeue (AQueue (v ::~va,~[])) =\\ +\hspace*{0pt} ~~~let\\ +\hspace*{0pt} ~~~~~val y ::~ys = rev (v ::~va);\\ +\hspace*{0pt} ~~~in\\ +\hspace*{0pt} ~~~~~(SOME y,~AQueue ([],~ys))\\ +\hspace*{0pt} ~~~end;\\ +\hspace*{0pt}\\ +\hspace*{0pt}fun enqueue x (AQueue (xs,~ys)) = AQueue (x ::~xs,~ys);\\ +\hspace*{0pt}\\ +\hspace*{0pt}end;~(*struct Example*)% +\end{isamarkuptext}% +\isamarkuptrue% +% +\endisatagquote +{\isafoldquote}% +% +\isadelimquote +% +\endisadelimquote +% +\begin{isamarkuptext}% +\noindent The \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command takes a space-separated list of + constants for which code shall be generated; anything else needed for those + is added implicitly. Then follows a target language identifier + (\isa{SML}, \isa{OCaml} or \isa{Haskell}) and a freely chosen module name. + A file name denotes the destination to store the generated code. Note that + the semantics of the destination depends on the target language: for + \isa{SML} and \isa{OCaml} it denotes a \emph{file}, for \isa{Haskell} + it denotes a \emph{directory} where a file named as the module name + (with extension \isa{{\isachardot}hs}) is written:% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isadelimquote +% +\endisadelimquote +% +\isatagquote +\isacommand{export{\isacharunderscore}code}\isamarkupfalse% +\ empty\ dequeue\ enqueue\ \isakeyword{in}\ Haskell\isanewline +\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}{\isachardoublequoteclose}% +\endisatagquote +{\isafoldquote}% +% +\isadelimquote +% +\endisadelimquote +% +\begin{isamarkuptext}% +\noindent This is how the corresponding code in \isa{Haskell} looks like:% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isadelimquote +% +\endisadelimquote +% +\isatagquote +% +\begin{isamarkuptext}% +\isatypewriter% +\noindent% +\hspace*{0pt}module Example where {\char123}\\ +\hspace*{0pt}\\ +\hspace*{0pt}\\ +\hspace*{0pt}foldla ::~forall a b.~(a -> b -> a) -> a -> [b] -> a;\\ +\hspace*{0pt}foldla f a [] = a;\\ +\hspace*{0pt}foldla f a (x :~xs) = foldla f (f a x) xs;\\ +\hspace*{0pt}\\ +\hspace*{0pt}rev ::~forall a.~[a] -> [a];\\ +\hspace*{0pt}rev xs = foldla ({\char92}~xsa x -> x :~xsa) [] xs;\\ +\hspace*{0pt}\\ +\hspace*{0pt}list{\char95}case ::~forall t a.~t -> (a -> [a] -> t) -> [a] -> t;\\ +\hspace*{0pt}list{\char95}case f1 f2 (a :~list) = f2 a list;\\ +\hspace*{0pt}list{\char95}case f1 f2 [] = f1;\\ +\hspace*{0pt}\\ +\hspace*{0pt}data Queue a = AQueue [a] [a];\\ +\hspace*{0pt}\\ +\hspace*{0pt}empty ::~forall a.~Queue a;\\ +\hspace*{0pt}empty = AQueue [] [];\\ +\hspace*{0pt}\\ +\hspace*{0pt}dequeue ::~forall a.~Queue a -> (Maybe a,~Queue a);\\ +\hspace*{0pt}dequeue (AQueue [] []) = (Nothing,~AQueue [] []);\\ +\hspace*{0pt}dequeue (AQueue xs (y :~ys)) = (Just y,~AQueue xs ys);\\ +\hspace*{0pt}dequeue (AQueue (v :~va) []) =\\ +\hspace*{0pt} ~let {\char123}\\ +\hspace*{0pt} ~~~(y :~ys) = rev (v :~va);\\ +\hspace*{0pt} ~{\char125}~in (Just y,~AQueue [] ys);\\ +\hspace*{0pt}\\ +\hspace*{0pt}enqueue ::~forall a.~a -> Queue a -> Queue a;\\ +\hspace*{0pt}enqueue x (AQueue xs ys) = AQueue (x :~xs) ys;\\ +\hspace*{0pt}\\ +\hspace*{0pt}{\char125}% +\end{isamarkuptext}% +\isamarkuptrue% +% +\endisatagquote +{\isafoldquote}% +% +\isadelimquote +% +\endisadelimquote +% +\begin{isamarkuptext}% +\noindent This demonstrates the basic usage of the \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command; + for more details see \secref{sec:further}.% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isamarkupsubsection{Code generator architecture \label{sec:concept}% +} +\isamarkuptrue% +% +\begin{isamarkuptext}% +What you have seen so far should be already enough in a lot of cases. If you + are content with this, you can quit reading here. Anyway, in order to customise + and adapt the code generator, it is inevitable to gain some understanding + how it works. + + \begin{figure}[h] + \begin{tikzpicture}[x = 4.2cm, y = 1cm] + \tikzstyle entity=[rounded corners, draw, thick, color = black, fill = white]; + \tikzstyle process=[ellipse, draw, thick, color = green, fill = white]; + \tikzstyle process_arrow=[->, semithick, color = green]; + \node (HOL) at (0, 4) [style=entity] {\isa{Isabelle{\isacharslash}HOL} theory}; + \node (eqn) at (2, 2) [style=entity] {code equations}; + \node (iml) at (2, 0) [style=entity] {intermediate language}; + \node (seri) at (1, 0) [style=process] {serialisation}; + \node (SML) at (0, 3) [style=entity] {\isa{SML}}; + \node (OCaml) at (0, 2) [style=entity] {\isa{OCaml}}; + \node (further) at (0, 1) [style=entity] {\isa{{\isasymdots}}}; + \node (Haskell) at (0, 0) [style=entity] {\isa{Haskell}}; + \draw [style=process_arrow] (HOL) .. controls (2, 4) .. + node [style=process, near start] {selection} + node [style=process, near end] {preprocessing} + (eqn); + \draw [style=process_arrow] (eqn) -- node (transl) [style=process] {translation} (iml); + \draw [style=process_arrow] (iml) -- (seri); + \draw [style=process_arrow] (seri) -- (SML); + \draw [style=process_arrow] (seri) -- (OCaml); + \draw [style=process_arrow, dashed] (seri) -- (further); + \draw [style=process_arrow] (seri) -- (Haskell); + \end{tikzpicture} + \caption{Code generator architecture} + \label{fig:arch} + \end{figure} + + The code generator employs a notion of executability + for three foundational executable ingredients known + from functional programming: + \emph{code equations}, \emph{datatypes}, and + \emph{type classes}. A code equation as a first approximation + is a theorem of the form \isa{f\ t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n\ {\isasymequiv}\ t} + (an equation headed by a constant \isa{f} with arguments + \isa{t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n} and right hand side \isa{t}). + Code generation aims to turn code equations + into a functional program. This is achieved by three major + components which operate sequentially, i.e. the result of one is + the input + of the next in the chain, see diagram \ref{fig:arch}: + + \begin{itemize} + + \item Out of the vast collection of theorems proven in a + \qn{theory}, a reasonable subset modelling + code equations is \qn{selected}. + + \item On those selected theorems, certain + transformations are carried out + (\qn{preprocessing}). Their purpose is to turn theorems + representing non- or badly executable + specifications into equivalent but executable counterparts. + The result is a structured collection of \qn{code theorems}. + + \item Before the selected code equations are continued with, + they can be \qn{preprocessed}, i.e. subjected to theorem + transformations. This \qn{preprocessor} is an interface which + allows to apply + the full expressiveness of ML-based theorem transformations + to code generation; motivating examples are shown below, see + \secref{sec:preproc}. + The result of the preprocessing step is a structured collection + of code equations. + + \item These code equations are \qn{translated} to a program + in an abstract intermediate language. Think of it as a kind + of \qt{Mini-Haskell} with four \qn{statements}: \isa{data} + (for datatypes), \isa{fun} (stemming from code equations), + also \isa{class} and \isa{inst} (for type classes). + + \item Finally, the abstract program is \qn{serialised} into concrete + source code of a target language. + + \end{itemize} + + \noindent From these steps, only the two last are carried out outside the logic; by + keeping this layer as thin as possible, the amount of code to trust is + kept to a minimum.% +\end{isamarkuptext}% +\isamarkuptrue% +% +\isadelimtheory +% +\endisadelimtheory +% +\isatagtheory +\isacommand{end}\isamarkupfalse% +% +\endisatagtheory +{\isafoldtheory}% +% +\isadelimtheory +% +\endisadelimtheory +\isanewline +\end{isabellebody}% +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "root" +%%% End: