doc-src/Codegen/Thy/document/Introduction.tex
changeset 30226 2f4684e2ea95
parent 29798 6df726203e39
child 30227 853abb4853cc
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc-src/Codegen/Thy/document/Introduction.tex	Tue Mar 03 11:00:51 2009 +0100
@@ -0,0 +1,390 @@
+%
+\begin{isabellebody}%
+\def\isabellecontext{Introduction}%
+%
+\isadelimtheory
+%
+\endisadelimtheory
+%
+\isatagtheory
+\isacommand{theory}\isamarkupfalse%
+\ Introduction\isanewline
+\isakeyword{imports}\ Setup\isanewline
+\isakeyword{begin}%
+\endisatagtheory
+{\isafoldtheory}%
+%
+\isadelimtheory
+%
+\endisadelimtheory
+%
+\isamarkupchapter{Code generation from \isa{Isabelle{\isacharslash}HOL} theories%
+}
+\isamarkuptrue%
+%
+\isamarkupsection{Introduction and Overview%
+}
+\isamarkuptrue%
+%
+\begin{isamarkuptext}%
+This tutorial introduces a generic code generator for the
+  \isa{Isabelle} system.
+  Generic in the sense that the
+  \qn{target language} for which code shall ultimately be
+  generated is not fixed but may be an arbitrary state-of-the-art
+  functional programming language (currently, the implementation
+  supports \isa{SML} \cite{SML}, \isa{OCaml} \cite{OCaml} and \isa{Haskell}
+  \cite{haskell-revised-report}).
+
+  Conceptually the code generator framework is part
+  of Isabelle's \hyperlink{theory.Pure}{\mbox{\isa{Pure}}} meta logic framework; the logic
+  \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} which is an extension of \hyperlink{theory.Pure}{\mbox{\isa{Pure}}}
+  already comes with a reasonable framework setup and thus provides
+  a good working horse for raising code-generation-driven
+  applications.  So, we assume some familiarity and experience
+  with the ingredients of the \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} distribution theories.
+  (see also \cite{isa-tutorial}).
+
+  The code generator aims to be usable with no further ado
+  in most cases while allowing for detailed customisation.
+  This manifests in the structure of this tutorial: after a short
+  conceptual introduction with an example (\secref{sec:intro}),
+  we discuss the generic customisation facilities (\secref{sec:program}).
+  A further section (\secref{sec:adaption}) is dedicated to the matter of
+  \qn{adaption} to specific target language environments.  After some
+  further issues (\secref{sec:further}) we conclude with an overview
+  of some ML programming interfaces (\secref{sec:ml}).
+
+  \begin{warn}
+    Ultimately, the code generator which this tutorial deals with
+    is supposed to replace the existing code generator
+    by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
+    So, for the moment, there are two distinct code generators
+    in Isabelle.  In case of ambiguity, we will refer to the framework
+    described here as \isa{generic\ code\ generator}, to the
+    other as \isa{SML\ code\ generator}.
+    Also note that while the framework itself is
+    object-logic independent, only \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} provides a reasonable
+    framework setup.    
+  \end{warn}%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isamarkupsubsection{Code generation via shallow embedding \label{sec:intro}%
+}
+\isamarkuptrue%
+%
+\begin{isamarkuptext}%
+The key concept for understanding \isa{Isabelle}'s code generation is
+  \emph{shallow embedding}, i.e.~logical entities like constants, types and
+  classes are identified with corresponding concepts in the target language.
+
+  Inside \hyperlink{theory.HOL}{\mbox{\isa{HOL}}}, the \hyperlink{command.datatype}{\mbox{\isa{\isacommand{datatype}}}} and
+  \hyperlink{command.definition}{\mbox{\isa{\isacommand{definition}}}}/\hyperlink{command.primrec}{\mbox{\isa{\isacommand{primrec}}}}/\hyperlink{command.fun}{\mbox{\isa{\isacommand{fun}}}} declarations form
+  the core of a functional programming language.  The default code generator setup
+  allows to turn those into functional programs immediately.
+  This means that \qt{naive} code generation can proceed without further ado.
+  For example, here a simple \qt{implementation} of amortised queues:%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\isatagquote
+\isacommand{datatype}\isamarkupfalse%
+\ {\isacharprime}a\ queue\ {\isacharequal}\ AQueue\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\isanewline
+\isanewline
+\isacommand{definition}\isamarkupfalse%
+\ empty\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
+\ \ {\isachardoublequoteopen}empty\ {\isacharequal}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isachardoublequoteclose}\isanewline
+\isanewline
+\isacommand{primrec}\isamarkupfalse%
+\ enqueue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ {\isasymRightarrow}\ {\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
+\ \ {\isachardoublequoteopen}enqueue\ x\ {\isacharparenleft}AQueue\ xs\ ys{\isacharparenright}\ {\isacharequal}\ AQueue\ {\isacharparenleft}x\ {\isacharhash}\ xs{\isacharparenright}\ ys{\isachardoublequoteclose}\isanewline
+\isanewline
+\isacommand{fun}\isamarkupfalse%
+\ dequeue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ option\ {\isasymtimes}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
+\ \ \ \ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}None{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}{\isachardoublequoteclose}\isanewline
+\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharparenleft}y\ {\isacharhash}\ ys{\isacharparenright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ xs\ ys{\isacharparenright}{\isachardoublequoteclose}\isanewline
+\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\isanewline
+\ \ \ \ \ \ {\isacharparenleft}case\ rev\ xs\ of\ y\ {\isacharhash}\ ys\ {\isasymRightarrow}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ ys{\isacharparenright}{\isacharparenright}{\isachardoublequoteclose}%
+\endisatagquote
+{\isafoldquote}%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\begin{isamarkuptext}%
+\noindent Then we can generate code e.g.~for \isa{SML} as follows:%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\isatagquote
+\isacommand{export{\isacharunderscore}code}\isamarkupfalse%
+\ empty\ dequeue\ enqueue\ \isakeyword{in}\ SML\isanewline
+\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}example{\isachardot}ML{\isachardoublequoteclose}%
+\endisatagquote
+{\isafoldquote}%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\begin{isamarkuptext}%
+\noindent resulting in the following code:%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\isatagquote
+%
+\begin{isamarkuptext}%
+\isatypewriter%
+\noindent%
+\hspace*{0pt}structure Example = \\
+\hspace*{0pt}struct\\
+\hspace*{0pt}\\
+\hspace*{0pt}fun foldl f a [] = a\\
+\hspace*{0pt} ~| foldl f a (x ::~xs) = foldl f (f a x) xs;\\
+\hspace*{0pt}\\
+\hspace*{0pt}fun rev xs = foldl (fn xsa => fn x => x ::~xsa) [] xs;\\
+\hspace*{0pt}\\
+\hspace*{0pt}fun list{\char95}case f1 f2 (a ::~lista) = f2 a lista\\
+\hspace*{0pt} ~| list{\char95}case f1 f2 [] = f1;\\
+\hspace*{0pt}\\
+\hspace*{0pt}datatype 'a queue = AQueue of 'a list * 'a list;\\
+\hspace*{0pt}\\
+\hspace*{0pt}val empty :~'a queue = AQueue ([],~[])\\
+\hspace*{0pt}\\
+\hspace*{0pt}fun dequeue (AQueue ([],~[])) = (NONE,~AQueue ([],~[]))\\
+\hspace*{0pt} ~| dequeue (AQueue (xs,~y ::~ys)) = (SOME y,~AQueue (xs,~ys))\\
+\hspace*{0pt} ~| dequeue (AQueue (v ::~va,~[])) =\\
+\hspace*{0pt} ~~~let\\
+\hspace*{0pt} ~~~~~val y ::~ys = rev (v ::~va);\\
+\hspace*{0pt} ~~~in\\
+\hspace*{0pt} ~~~~~(SOME y,~AQueue ([],~ys))\\
+\hspace*{0pt} ~~~end;\\
+\hspace*{0pt}\\
+\hspace*{0pt}fun enqueue x (AQueue (xs,~ys)) = AQueue (x ::~xs,~ys);\\
+\hspace*{0pt}\\
+\hspace*{0pt}end;~(*struct Example*)%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\endisatagquote
+{\isafoldquote}%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\begin{isamarkuptext}%
+\noindent The \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command takes a space-separated list of
+  constants for which code shall be generated;  anything else needed for those
+  is added implicitly.  Then follows a target language identifier
+  (\isa{SML}, \isa{OCaml} or \isa{Haskell}) and a freely chosen module name.
+  A file name denotes the destination to store the generated code.  Note that
+  the semantics of the destination depends on the target language:  for
+  \isa{SML} and \isa{OCaml} it denotes a \emph{file}, for \isa{Haskell}
+  it denotes a \emph{directory} where a file named as the module name
+  (with extension \isa{{\isachardot}hs}) is written:%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\isatagquote
+\isacommand{export{\isacharunderscore}code}\isamarkupfalse%
+\ empty\ dequeue\ enqueue\ \isakeyword{in}\ Haskell\isanewline
+\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}{\isachardoublequoteclose}%
+\endisatagquote
+{\isafoldquote}%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\begin{isamarkuptext}%
+\noindent This is how the corresponding code in \isa{Haskell} looks like:%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\isatagquote
+%
+\begin{isamarkuptext}%
+\isatypewriter%
+\noindent%
+\hspace*{0pt}module Example where {\char123}\\
+\hspace*{0pt}\\
+\hspace*{0pt}\\
+\hspace*{0pt}foldla ::~forall a b.~(a -> b -> a) -> a -> [b] -> a;\\
+\hspace*{0pt}foldla f a [] = a;\\
+\hspace*{0pt}foldla f a (x :~xs) = foldla f (f a x) xs;\\
+\hspace*{0pt}\\
+\hspace*{0pt}rev ::~forall a.~[a] -> [a];\\
+\hspace*{0pt}rev xs = foldla ({\char92}~xsa x -> x :~xsa) [] xs;\\
+\hspace*{0pt}\\
+\hspace*{0pt}list{\char95}case ::~forall t a.~t -> (a -> [a] -> t) -> [a] -> t;\\
+\hspace*{0pt}list{\char95}case f1 f2 (a :~list) = f2 a list;\\
+\hspace*{0pt}list{\char95}case f1 f2 [] = f1;\\
+\hspace*{0pt}\\
+\hspace*{0pt}data Queue a = AQueue [a] [a];\\
+\hspace*{0pt}\\
+\hspace*{0pt}empty ::~forall a.~Queue a;\\
+\hspace*{0pt}empty = AQueue [] [];\\
+\hspace*{0pt}\\
+\hspace*{0pt}dequeue ::~forall a.~Queue a -> (Maybe a,~Queue a);\\
+\hspace*{0pt}dequeue (AQueue [] []) = (Nothing,~AQueue [] []);\\
+\hspace*{0pt}dequeue (AQueue xs (y :~ys)) = (Just y,~AQueue xs ys);\\
+\hspace*{0pt}dequeue (AQueue (v :~va) []) =\\
+\hspace*{0pt} ~let {\char123}\\
+\hspace*{0pt} ~~~(y :~ys) = rev (v :~va);\\
+\hspace*{0pt} ~{\char125}~in (Just y,~AQueue [] ys);\\
+\hspace*{0pt}\\
+\hspace*{0pt}enqueue ::~forall a.~a -> Queue a -> Queue a;\\
+\hspace*{0pt}enqueue x (AQueue xs ys) = AQueue (x :~xs) ys;\\
+\hspace*{0pt}\\
+\hspace*{0pt}{\char125}%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\endisatagquote
+{\isafoldquote}%
+%
+\isadelimquote
+%
+\endisadelimquote
+%
+\begin{isamarkuptext}%
+\noindent This demonstrates the basic usage of the \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command;
+  for more details see \secref{sec:further}.%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isamarkupsubsection{Code generator architecture \label{sec:concept}%
+}
+\isamarkuptrue%
+%
+\begin{isamarkuptext}%
+What you have seen so far should be already enough in a lot of cases.  If you
+  are content with this, you can quit reading here.  Anyway, in order to customise
+  and adapt the code generator, it is inevitable to gain some understanding
+  how it works.
+
+  \begin{figure}[h]
+    \begin{tikzpicture}[x = 4.2cm, y = 1cm]
+      \tikzstyle entity=[rounded corners, draw, thick, color = black, fill = white];
+      \tikzstyle process=[ellipse, draw, thick, color = green, fill = white];
+      \tikzstyle process_arrow=[->, semithick, color = green];
+      \node (HOL) at (0, 4) [style=entity] {\isa{Isabelle{\isacharslash}HOL} theory};
+      \node (eqn) at (2, 2) [style=entity] {code equations};
+      \node (iml) at (2, 0) [style=entity] {intermediate language};
+      \node (seri) at (1, 0) [style=process] {serialisation};
+      \node (SML) at (0, 3) [style=entity] {\isa{SML}};
+      \node (OCaml) at (0, 2) [style=entity] {\isa{OCaml}};
+      \node (further) at (0, 1) [style=entity] {\isa{{\isasymdots}}};
+      \node (Haskell) at (0, 0) [style=entity] {\isa{Haskell}};
+      \draw [style=process_arrow] (HOL) .. controls (2, 4) ..
+        node [style=process, near start] {selection}
+        node [style=process, near end] {preprocessing}
+        (eqn);
+      \draw [style=process_arrow] (eqn) -- node (transl) [style=process] {translation} (iml);
+      \draw [style=process_arrow] (iml) -- (seri);
+      \draw [style=process_arrow] (seri) -- (SML);
+      \draw [style=process_arrow] (seri) -- (OCaml);
+      \draw [style=process_arrow, dashed] (seri) -- (further);
+      \draw [style=process_arrow] (seri) -- (Haskell);
+    \end{tikzpicture}
+    \caption{Code generator architecture}
+    \label{fig:arch}
+  \end{figure}
+
+  The code generator employs a notion of executability
+  for three foundational executable ingredients known
+  from functional programming:
+  \emph{code equations}, \emph{datatypes}, and
+  \emph{type classes}.  A code equation as a first approximation
+  is a theorem of the form \isa{f\ t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n\ {\isasymequiv}\ t}
+  (an equation headed by a constant \isa{f} with arguments
+  \isa{t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n} and right hand side \isa{t}).
+  Code generation aims to turn code equations
+  into a functional program.  This is achieved by three major
+  components which operate sequentially, i.e. the result of one is
+  the input
+  of the next in the chain,  see diagram \ref{fig:arch}:
+
+  \begin{itemize}
+
+    \item Out of the vast collection of theorems proven in a
+      \qn{theory}, a reasonable subset modelling
+      code equations is \qn{selected}.
+
+    \item On those selected theorems, certain
+      transformations are carried out
+      (\qn{preprocessing}).  Their purpose is to turn theorems
+      representing non- or badly executable
+      specifications into equivalent but executable counterparts.
+      The result is a structured collection of \qn{code theorems}.
+
+    \item Before the selected code equations are continued with,
+      they can be \qn{preprocessed}, i.e. subjected to theorem
+      transformations.  This \qn{preprocessor} is an interface which
+      allows to apply
+      the full expressiveness of ML-based theorem transformations
+      to code generation;  motivating examples are shown below, see
+      \secref{sec:preproc}.
+      The result of the preprocessing step is a structured collection
+      of code equations.
+
+    \item These code equations are \qn{translated} to a program
+      in an abstract intermediate language.  Think of it as a kind
+      of \qt{Mini-Haskell} with four \qn{statements}: \isa{data}
+      (for datatypes), \isa{fun} (stemming from code equations),
+      also \isa{class} and \isa{inst} (for type classes).
+
+    \item Finally, the abstract program is \qn{serialised} into concrete
+      source code of a target language.
+
+  \end{itemize}
+
+  \noindent From these steps, only the two last are carried out outside the logic;  by
+  keeping this layer as thin as possible, the amount of code to trust is
+  kept to a minimum.%
+\end{isamarkuptext}%
+\isamarkuptrue%
+%
+\isadelimtheory
+%
+\endisadelimtheory
+%
+\isatagtheory
+\isacommand{end}\isamarkupfalse%
+%
+\endisatagtheory
+{\isafoldtheory}%
+%
+\isadelimtheory
+%
+\endisadelimtheory
+\isanewline
+\end{isabellebody}%
+%%% Local Variables:
+%%% mode: latex
+%%% TeX-master: "root"
+%%% End: