doc-src/Codegen/Thy/document/Introduction.tex
author haftmann
Fri, 13 Aug 2010 13:43:54 +0200
changeset 38402 58fc3a3af71f
parent 37610 1b09880d9734
child 38405 7935b334893e
permissions -rw-r--r--
added stub "If something utterly fails"

%
\begin{isabellebody}%
\def\isabellecontext{Introduction}%
%
\isadelimtheory
%
\endisadelimtheory
%
\isatagtheory
\isacommand{theory}\isamarkupfalse%
\ Introduction\isanewline
\isakeyword{imports}\ Setup\isanewline
\isakeyword{begin}%
\endisatagtheory
{\isafoldtheory}%
%
\isadelimtheory
%
\endisadelimtheory
%
\isamarkupsection{Introduction and Overview%
}
\isamarkuptrue%
%
\begin{isamarkuptext}%
This tutorial introduces a generic code generator for the
  \isa{Isabelle} system.
  The
  \qn{target language} for which code is
  generated is not fixed, but may be one of several
  functional programming languages (currently, the implementation
  supports \isa{SML} \cite{SML}, \isa{OCaml} \cite{OCaml} and \isa{Haskell}
  \cite{haskell-revised-report}).

  Conceptually the code generator framework is part
  of Isabelle's \hyperlink{theory.Pure}{\mbox{\isa{Pure}}} meta logic framework; the logic
  \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} \cite{isa-tutorial}, which is an extension of \hyperlink{theory.Pure}{\mbox{\isa{Pure}}},
  already comes with a reasonable framework setup and thus provides
  a good basis for creating code-generation-driven
  applications.  So, we assume some familiarity and experience
  with the ingredients of the \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} distribution theories.

  The code generator aims to be usable with no further ado
  in most cases, while allowing for detailed customisation.
  This can be seen in the structure of this tutorial: after a short
  conceptual introduction with an example (\secref{sec:intro}),
  we discuss the generic customisation facilities (\secref{sec:program}).
  A further section (\secref{sec:adaptation}) is dedicated to the matter of
  \qn{adaptation} to specific target language environments.  After some
  further issues (\secref{sec:further}) we conclude with an overview
  of some ML programming interfaces (\secref{sec:ml}).

  \begin{warn}
    Ultimately, the code generator which this tutorial deals with
    is supposed to replace the existing code generator
    by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
    So, for the moment, there are two distinct code generators
    in Isabelle.  In case of ambiguity, we will refer to the framework
    described here as \isa{generic\ code\ generator}, to the
    other as \isa{SML\ code\ generator}.
    Also note that while the framework itself is
    object-logic independent, only \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} provides a reasonable
    framework setup.    
  \end{warn}%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isamarkupsubsection{Code generation via shallow embedding \label{sec:intro}%
}
\isamarkuptrue%
%
\begin{isamarkuptext}%
The key concept for understanding \isa{Isabelle}'s code generation is
  \emph{shallow embedding}, i.e.~logical entities like constants, types and
  classes are identified with corresponding concepts in the target language.

  Inside \hyperlink{theory.HOL}{\mbox{\isa{HOL}}}, the \hyperlink{command.datatype}{\mbox{\isa{\isacommand{datatype}}}} and
  \hyperlink{command.definition}{\mbox{\isa{\isacommand{definition}}}}/\hyperlink{command.primrec}{\mbox{\isa{\isacommand{primrec}}}}/\hyperlink{command.fun}{\mbox{\isa{\isacommand{fun}}}} declarations form
  the core of a functional programming language.  The default code generator setup
   transforms those into functional programs immediately.
  This means that \qt{naive} code generation can proceed without further ado.
  For example, here a simple \qt{implementation} of amortised queues:%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isadelimquote
%
\endisadelimquote
%
\isatagquote
\isacommand{datatype}\isamarkupfalse%
\ {\isacharprime}a\ queue\ {\isacharequal}\ AQueue\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\isanewline
\isanewline
\isacommand{definition}\isamarkupfalse%
\ empty\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
\ \ {\isachardoublequoteopen}empty\ {\isacharequal}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isachardoublequoteclose}\isanewline
\isanewline
\isacommand{primrec}\isamarkupfalse%
\ enqueue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ {\isasymRightarrow}\ {\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
\ \ {\isachardoublequoteopen}enqueue\ x\ {\isacharparenleft}AQueue\ xs\ ys{\isacharparenright}\ {\isacharequal}\ AQueue\ {\isacharparenleft}x\ {\isacharhash}\ xs{\isacharparenright}\ ys{\isachardoublequoteclose}\isanewline
\isanewline
\isacommand{fun}\isamarkupfalse%
\ dequeue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ option\ {\isasymtimes}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
\ \ \ \ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}None{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}{\isachardoublequoteclose}\isanewline
\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharparenleft}y\ {\isacharhash}\ ys{\isacharparenright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ xs\ ys{\isacharparenright}{\isachardoublequoteclose}\isanewline
\ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\isanewline
\ \ \ \ \ \ {\isacharparenleft}case\ rev\ xs\ of\ y\ {\isacharhash}\ ys\ {\isasymRightarrow}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ ys{\isacharparenright}{\isacharparenright}{\isachardoublequoteclose}%
\endisatagquote
{\isafoldquote}%
%
\isadelimquote
%
\endisadelimquote
%
\begin{isamarkuptext}%
\noindent Then we can generate code e.g.~for \isa{SML} as follows:%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isadelimquote
%
\endisadelimquote
%
\isatagquote
\isacommand{export{\isacharunderscore}code}\isamarkupfalse%
\ empty\ dequeue\ enqueue\ \isakeyword{in}\ SML\isanewline
\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}example{\isachardot}ML{\isachardoublequoteclose}%
\endisatagquote
{\isafoldquote}%
%
\isadelimquote
%
\endisadelimquote
%
\begin{isamarkuptext}%
\noindent resulting in the following code:%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isadelimquote
%
\endisadelimquote
%
\isatagquote
%
\begin{isamarkuptext}%
\isatypewriter%
\noindent%
\hspace*{0pt}structure Example :~sig\\
\hspace*{0pt} ~val foldl :~('a -> 'b -> 'a) -> 'a -> 'b list -> 'a\\
\hspace*{0pt} ~val rev :~'a list -> 'a list\\
\hspace*{0pt} ~datatype 'a queue = AQueue of 'a list * 'a list\\
\hspace*{0pt} ~val empty :~'a queue\\
\hspace*{0pt} ~val dequeue :~'a queue -> 'a option * 'a queue\\
\hspace*{0pt} ~val enqueue :~'a -> 'a queue -> 'a queue\\
\hspace*{0pt}end = struct\\
\hspace*{0pt}\\
\hspace*{0pt}fun foldl f a [] = a\\
\hspace*{0pt} ~| foldl f a (x ::~xs) = foldl f (f a x) xs;\\
\hspace*{0pt}\\
\hspace*{0pt}fun rev xs = foldl (fn xsa => fn x => x ::~xsa) [] xs;\\
\hspace*{0pt}\\
\hspace*{0pt}datatype 'a queue = AQueue of 'a list * 'a list;\\
\hspace*{0pt}\\
\hspace*{0pt}val empty :~'a queue = AQueue ([],~[]);\\
\hspace*{0pt}\\
\hspace*{0pt}fun dequeue (AQueue ([],~[])) = (NONE,~AQueue ([],~[]))\\
\hspace*{0pt} ~| dequeue (AQueue (xs,~y ::~ys)) = (SOME y,~AQueue (xs,~ys))\\
\hspace*{0pt} ~| dequeue (AQueue (v ::~va,~[])) =\\
\hspace*{0pt} ~~~let\\
\hspace*{0pt} ~~~~~val y ::~ys = rev (v ::~va);\\
\hspace*{0pt} ~~~in\\
\hspace*{0pt} ~~~~~(SOME y,~AQueue ([],~ys))\\
\hspace*{0pt} ~~~end;\\
\hspace*{0pt}\\
\hspace*{0pt}fun enqueue x (AQueue (xs,~ys)) = AQueue (x ::~xs,~ys);\\
\hspace*{0pt}\\
\hspace*{0pt}end;~(*struct Example*)%
\end{isamarkuptext}%
\isamarkuptrue%
%
\endisatagquote
{\isafoldquote}%
%
\isadelimquote
%
\endisadelimquote
%
\begin{isamarkuptext}%
\noindent The \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command takes a space-separated list of
  constants for which code shall be generated;  anything else needed for those
  is added implicitly.  Then follows a target language identifier
  (\isa{SML}, \isa{OCaml} or \isa{Haskell}) and a freely chosen module name.
  A file name denotes the destination to store the generated code.  Note that
  the semantics of the destination depends on the target language:  for
  \isa{SML} and \isa{OCaml} it denotes a \emph{file}, for \isa{Haskell}
  it denotes a \emph{directory} where a file named as the module name
  (with extension \isa{{\isachardot}hs}) is written:%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isadelimquote
%
\endisadelimquote
%
\isatagquote
\isacommand{export{\isacharunderscore}code}\isamarkupfalse%
\ empty\ dequeue\ enqueue\ \isakeyword{in}\ Haskell\isanewline
\ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}{\isachardoublequoteclose}%
\endisatagquote
{\isafoldquote}%
%
\isadelimquote
%
\endisadelimquote
%
\begin{isamarkuptext}%
\noindent This is the corresponding code in \isa{Haskell}:%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isadelimquote
%
\endisadelimquote
%
\isatagquote
%
\begin{isamarkuptext}%
\isatypewriter%
\noindent%
\hspace*{0pt}module Example where {\char123}\\
\hspace*{0pt}\\
\hspace*{0pt}data Queue a = AQueue [a] [a];\\
\hspace*{0pt}\\
\hspace*{0pt}empty ::~forall a.~Queue a;\\
\hspace*{0pt}empty = AQueue [] [];\\
\hspace*{0pt}\\
\hspace*{0pt}dequeue ::~forall a.~Queue a -> (Maybe a,~Queue a);\\
\hspace*{0pt}dequeue (AQueue [] []) = (Nothing,~AQueue [] []);\\
\hspace*{0pt}dequeue (AQueue xs (y :~ys)) = (Just y,~AQueue xs ys);\\
\hspace*{0pt}dequeue (AQueue (v :~va) []) =\\
\hspace*{0pt} ~let {\char123}\\
\hspace*{0pt} ~~~(y :~ys) = reverse (v :~va);\\
\hspace*{0pt} ~{\char125}~in (Just y,~AQueue [] ys);\\
\hspace*{0pt}\\
\hspace*{0pt}enqueue ::~forall a.~a -> Queue a -> Queue a;\\
\hspace*{0pt}enqueue x (AQueue xs ys) = AQueue (x :~xs) ys;\\
\hspace*{0pt}\\
\hspace*{0pt}{\char125}%
\end{isamarkuptext}%
\isamarkuptrue%
%
\endisatagquote
{\isafoldquote}%
%
\isadelimquote
%
\endisadelimquote
%
\begin{isamarkuptext}%
\noindent This demonstrates the basic usage of the \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command;
  for more details see \secref{sec:further}.%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isamarkupsubsection{If something utterly fails%
}
\isamarkuptrue%
%
\begin{isamarkuptext}%
Under certain circumstances, the code generator fails to produce
  code entirely.  

  \begin{description}

    \ditem{generate only one module}

    \ditem{check with a different target language}

    \ditem{inspect code equations}

    \ditem{inspect preprocessor setup}

    \ditem{generate exceptions}

    \ditem{remove offending code equations}

  \end{description}%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isamarkupsubsection{Code generator architecture \label{sec:concept}%
}
\isamarkuptrue%
%
\begin{isamarkuptext}%
What you have seen so far should be already enough in a lot of cases.  If you
  are content with this, you can quit reading here.  Anyway, in order to customise
  and adapt the code generator, it is necessary to gain some understanding
  how it works.

  \begin{figure}[h]
    \includegraphics{architecture}
    \caption{Code generator architecture}
    \label{fig:arch}
  \end{figure}

  The code generator employs a notion of executability
  for three foundational executable ingredients known
  from functional programming:
  \emph{code equations}, \emph{datatypes}, and
  \emph{type classes}.  A code equation as a first approximation
  is a theorem of the form \isa{f\ t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n\ {\isasymequiv}\ t}
  (an equation headed by a constant \isa{f} with arguments
  \isa{t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n} and right hand side \isa{t}).
  Code generation aims to turn code equations
  into a functional program.  This is achieved by three major
  components which operate sequentially, i.e. the result of one is
  the input
  of the next in the chain,  see figure \ref{fig:arch}:

  \begin{itemize}

    \item The starting point is a collection of raw code equations in a
      theory. It is not relevant where they
      stem from, but typically they were either produced by specification
      tools or proved explicitly by the user.
      
    \item These raw code equations can be subjected to theorem transformations.  This
      \qn{preprocessor} can apply the full
      expressiveness of ML-based theorem transformations to code
      generation.  The result of preprocessing is a
      structured collection of code equations.

    \item These code equations are \qn{translated} to a program in an
      abstract intermediate language.  Think of it as a kind
      of \qt{Mini-Haskell} with four \qn{statements}: \isa{data}
      (for datatypes), \isa{fun} (stemming from code equations),
      also \isa{class} and \isa{inst} (for type classes).

    \item Finally, the abstract program is \qn{serialised} into concrete
      source code of a target language.
      This step only produces concrete syntax but does not change the
      program in essence; all conceptual transformations occur in the
      translation step.

  \end{itemize}

  \noindent From these steps, only the two last are carried out outside the logic;  by
  keeping this layer as thin as possible, the amount of code to trust is
  kept to a minimum.%
\end{isamarkuptext}%
\isamarkuptrue%
%
\isadelimtheory
%
\endisadelimtheory
%
\isatagtheory
\isacommand{end}\isamarkupfalse%
%
\endisatagtheory
{\isafoldtheory}%
%
\isadelimtheory
%
\endisadelimtheory
\isanewline
\end{isabellebody}%
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "root"
%%% End: