doc-src/IsarAdvanced/Codegen/Thy/document/Introduction.tex
author haftmann
Mon Nov 03 14:15:25 2008 +0100 (2008-11-03)
changeset 28714 1992553cccfe
parent 28636 d5342d4c7360
child 28727 185110a4b97a
permissions -rw-r--r--
improved verbatim mechanism
     1 %
     2 \begin{isabellebody}%
     3 \def\isabellecontext{Introduction}%
     4 %
     5 \isadelimtheory
     6 %
     7 \endisadelimtheory
     8 %
     9 \isatagtheory
    10 \isacommand{theory}\isamarkupfalse%
    11 \ Introduction\isanewline
    12 \isakeyword{imports}\ Setup\isanewline
    13 \isakeyword{begin}%
    14 \endisatagtheory
    15 {\isafoldtheory}%
    16 %
    17 \isadelimtheory
    18 %
    19 \endisadelimtheory
    20 %
    21 \isamarkupchapter{Code generation from \isa{Isabelle{\isacharslash}HOL} theories%
    22 }
    23 \isamarkuptrue%
    24 %
    25 \isamarkupsection{Introduction and Overview%
    26 }
    27 \isamarkuptrue%
    28 %
    29 \begin{isamarkuptext}%
    30 This tutorial introduces a generic code generator for the
    31   \isa{Isabelle} system.
    32   Generic in the sense that the
    33   \qn{target language} for which code shall ultimately be
    34   generated is not fixed but may be an arbitrary state-of-the-art
    35   functional programming language (currently, the implementation
    36   supports \isa{SML} \cite{SML}, \isa{OCaml} \cite{OCaml} and \isa{Haskell}
    37   \cite{haskell-revised-report}).
    38 
    39   Conceptually the code generator framework is part
    40   of Isabelle's \hyperlink{theory.Pure}{\mbox{\isa{Pure}}} meta logic framework; the logic
    41   \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} which is an extension of \hyperlink{theory.Pure}{\mbox{\isa{Pure}}}
    42   already comes with a reasonable framework setup and thus provides
    43   a good working horse for raising code-generation-driven
    44   applications.  So, we assume some familiarity and experience
    45   with the ingredients of the \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} distribution theories.
    46   (see also \cite{isa-tutorial}).
    47 
    48   The code generator aims to be usable with no further ado
    49   in most cases while allowing for detailed customisation.
    50   This manifests in the structure of this tutorial: after a short
    51   conceptual introduction with an example (\secref{sec:intro}),
    52   we discuss the generic customisation facilities (\secref{sec:program}).
    53   A further section (\secref{sec:adaption}) is dedicated to the matter of
    54   \qn{adaption} to specific target language environments.  After some
    55   further issues (\secref{sec:further}) we conclude with an overview
    56   of some ML programming interfaces (\secref{sec:ml}).
    57 
    58   \begin{warn}
    59     Ultimately, the code generator which this tutorial deals with
    60     is supposed to replace the existing code generator
    61     by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
    62     So, for the moment, there are two distinct code generators
    63     in Isabelle.  In case of ambiguity, we will refer to the framework
    64     described here as \isa{generic\ code\ generator}, to the
    65     other as \isa{SML\ code\ generator}.
    66     Also note that while the framework itself is
    67     object-logic independent, only \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} provides a reasonable
    68     framework setup.    
    69   \end{warn}%
    70 \end{isamarkuptext}%
    71 \isamarkuptrue%
    72 %
    73 \isamarkupsubsection{Code generation via shallow embedding \label{sec:intro}%
    74 }
    75 \isamarkuptrue%
    76 %
    77 \begin{isamarkuptext}%
    78 The key concept for understanding \isa{Isabelle}'s code generation is
    79   \emph{shallow embedding}, i.e.~logical entities like constants, types and
    80   classes are identified with corresponding concepts in the target language.
    81 
    82   Inside \hyperlink{theory.HOL}{\mbox{\isa{HOL}}}, the \hyperlink{command.datatype}{\mbox{\isa{\isacommand{datatype}}}} and
    83   \hyperlink{command.definition}{\mbox{\isa{\isacommand{definition}}}}/\hyperlink{command.primrec}{\mbox{\isa{\isacommand{primrec}}}}/\hyperlink{command.fun}{\mbox{\isa{\isacommand{fun}}}} declarations form
    84   the core of a functional programming language.  The default code generator setup
    85   allows to turn those into functional programs immediately.
    86   This means that \qt{naive} code generation can proceed without further ado.
    87   For example, here a simple \qt{implementation} of amortised queues:%
    88 \end{isamarkuptext}%
    89 \isamarkuptrue%
    90 %
    91 \isadelimquote
    92 %
    93 \endisadelimquote
    94 %
    95 \isatagquote
    96 \isacommand{datatype}\isamarkupfalse%
    97 \ {\isacharprime}a\ queue\ {\isacharequal}\ Queue\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\isanewline
    98 \isanewline
    99 \isacommand{definition}\isamarkupfalse%
   100 \ empty\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
   101 \ \ {\isachardoublequoteopen}empty\ {\isacharequal}\ Queue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isachardoublequoteclose}\isanewline
   102 \isanewline
   103 \isacommand{primrec}\isamarkupfalse%
   104 \ enqueue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ {\isasymRightarrow}\ {\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
   105 \ \ {\isachardoublequoteopen}enqueue\ x\ {\isacharparenleft}Queue\ xs\ ys{\isacharparenright}\ {\isacharequal}\ Queue\ {\isacharparenleft}x\ {\isacharhash}\ xs{\isacharparenright}\ ys{\isachardoublequoteclose}\isanewline
   106 \isanewline
   107 \isacommand{fun}\isamarkupfalse%
   108 \ dequeue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ option\ {\isasymtimes}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
   109 \ \ \ \ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}Queue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}None{\isacharcomma}\ Queue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}{\isachardoublequoteclose}\isanewline
   110 \ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}Queue\ xs\ {\isacharparenleft}y\ {\isacharhash}\ ys{\isacharparenright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}Some\ y{\isacharcomma}\ Queue\ xs\ ys{\isacharparenright}{\isachardoublequoteclose}\isanewline
   111 \ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}Queue\ xs\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\isanewline
   112 \ \ \ \ \ \ {\isacharparenleft}case\ rev\ xs\ of\ y\ {\isacharhash}\ ys\ {\isasymRightarrow}\ {\isacharparenleft}Some\ y{\isacharcomma}\ Queue\ {\isacharbrackleft}{\isacharbrackright}\ ys{\isacharparenright}{\isacharparenright}{\isachardoublequoteclose}%
   113 \endisatagquote
   114 {\isafoldquote}%
   115 %
   116 \isadelimquote
   117 %
   118 \endisadelimquote
   119 %
   120 \begin{isamarkuptext}%
   121 \noindent Then we can generate code e.g.~for \isa{SML} as follows:%
   122 \end{isamarkuptext}%
   123 \isamarkuptrue%
   124 %
   125 \isadelimquote
   126 %
   127 \endisadelimquote
   128 %
   129 \isatagquote
   130 \isacommand{export{\isacharunderscore}code}\isamarkupfalse%
   131 \ empty\ dequeue\ enqueue\ \isakeyword{in}\ SML\isanewline
   132 \ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}example{\isachardot}ML{\isachardoublequoteclose}%
   133 \endisatagquote
   134 {\isafoldquote}%
   135 %
   136 \isadelimquote
   137 %
   138 \endisadelimquote
   139 %
   140 \begin{isamarkuptext}%
   141 \noindent resulting in the following code:%
   142 \end{isamarkuptext}%
   143 \isamarkuptrue%
   144 %
   145 \isadelimquote
   146 %
   147 \endisadelimquote
   148 %
   149 \isatagquote
   150 %
   151 \begin{isamarkuptext}%
   152 \isaverbatim%
   153 \noindent%
   154 \hspace*{0pt}structure Example = \\
   155 \hspace*{0pt}struct\\
   156 \hspace*{0pt}\\
   157 \hspace*{0pt}fun foldl f a [] = a\\
   158 \hspace*{0pt} ~| foldl f a (x ::~xs) = foldl f (f a x) xs;\\
   159 \hspace*{0pt}\\
   160 \hspace*{0pt}fun rev xs = foldl (fn xsa => fn x => x ::~xsa) [] xs;\\
   161 \hspace*{0pt}\\
   162 \hspace*{0pt}fun list{\char95}case f1 f2 (a ::~lista) = f2 a lista\\
   163 \hspace*{0pt} ~| list{\char95}case f1 f2 [] = f1;\\
   164 \hspace*{0pt}\\
   165 \hspace*{0pt}datatype 'a queue = Queue of 'a list * 'a list;\\
   166 \hspace*{0pt}\\
   167 \hspace*{0pt}val empty :~'a queue = Queue ([], []);\\
   168 \hspace*{0pt}\\
   169 \hspace*{0pt}fun dequeue (Queue ([], [])) = (NONE, Queue ([], []))\\
   170 \hspace*{0pt} ~| dequeue (Queue (xs, y ::~ys)) = (SOME y, Queue (xs, ys))\\
   171 \hspace*{0pt} ~| dequeue (Queue (v ::~va, [])) =\\
   172 \hspace*{0pt} ~~~let\\
   173 \hspace*{0pt} ~~~~~val y ::~ys = rev (v ::~va);\\
   174 \hspace*{0pt} ~~~in\\
   175 \hspace*{0pt} ~~~~~(SOME y, Queue ([], ys))\\
   176 \hspace*{0pt} ~~~end;\\
   177 \hspace*{0pt}\\
   178 \hspace*{0pt}fun enqueue x (Queue (xs, ys)) = Queue (x ::~xs, ys);\\
   179 \hspace*{0pt}\\
   180 \hspace*{0pt}end; (*struct Example*)%
   181 \end{isamarkuptext}%
   182 \isamarkuptrue%
   183 %
   184 \endisatagquote
   185 {\isafoldquote}%
   186 %
   187 \isadelimquote
   188 %
   189 \endisadelimquote
   190 %
   191 \begin{isamarkuptext}%
   192 \noindent The \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command takes a space-separated list of
   193   constants for which code shall be generated;  anything else needed for those
   194   is added implicitly.  Then follows a target language identifier
   195   (\isa{SML}, \isa{OCaml} or \isa{Haskell}) and a freely chosen module name.
   196   A file name denotes the destination to store the generated code.  Note that
   197   the semantics of the destination depends on the target language:  for
   198   \isa{SML} and \isa{OCaml} it denotes a \emph{file}, for \isa{Haskell}
   199   it denotes a \emph{directory} where a file named as the module name
   200   (with extension \isa{{\isachardot}hs}) is written:%
   201 \end{isamarkuptext}%
   202 \isamarkuptrue%
   203 %
   204 \isadelimquote
   205 %
   206 \endisadelimquote
   207 %
   208 \isatagquote
   209 \isacommand{export{\isacharunderscore}code}\isamarkupfalse%
   210 \ empty\ dequeue\ enqueue\ \isakeyword{in}\ Haskell\isanewline
   211 \ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}{\isachardoublequoteclose}%
   212 \endisatagquote
   213 {\isafoldquote}%
   214 %
   215 \isadelimquote
   216 %
   217 \endisadelimquote
   218 %
   219 \begin{isamarkuptext}%
   220 \noindent This is how the corresponding code in \isa{Haskell} looks like:%
   221 \end{isamarkuptext}%
   222 \isamarkuptrue%
   223 %
   224 \isadelimquote
   225 %
   226 \endisadelimquote
   227 %
   228 \isatagquote
   229 %
   230 \begin{isamarkuptext}%
   231 \isaverbatim%
   232 \noindent%
   233 \hspace*{0pt}module Example where {\char123}\\
   234 \hspace*{0pt}\\
   235 \hspace*{0pt}\\
   236 \hspace*{0pt}foldla ::~forall a b. (a -> b -> a) -> a -> [b] -> a;\\
   237 \hspace*{0pt}foldla f a [] = a;\\
   238 \hspace*{0pt}foldla f a (x :~xs) = foldla f (f a x) xs;\\
   239 \hspace*{0pt}\\
   240 \hspace*{0pt}rev ::~forall a. [a] -> [a];\\
   241 \hspace*{0pt}rev xs = foldla ({\char92}~xsa x -> x :~xsa) [] xs;\\
   242 \hspace*{0pt}\\
   243 \hspace*{0pt}list{\char95}case ::~forall t a. t -> (a -> [a] -> t) -> [a] -> t;\\
   244 \hspace*{0pt}list{\char95}case f1 f2 (a :~list) = f2 a list;\\
   245 \hspace*{0pt}list{\char95}case f1 f2 [] = f1;\\
   246 \hspace*{0pt}\\
   247 \hspace*{0pt}data Queue a = Queue [a] [a];\\
   248 \hspace*{0pt}\\
   249 \hspace*{0pt}empty ::~forall a. Queue a;\\
   250 \hspace*{0pt}empty = Queue [] [];\\
   251 \hspace*{0pt}\\
   252 \hspace*{0pt}dequeue ::~forall a. Queue a -> (Maybe a, Queue a);\\
   253 \hspace*{0pt}dequeue (Queue [] []) = (Nothing, Queue [] []);\\
   254 \hspace*{0pt}dequeue (Queue xs (y :~ys)) = (Just y, Queue xs ys);\\
   255 \hspace*{0pt}dequeue (Queue (v :~va) []) =\\
   256 \hspace*{0pt} ~let {\char123}\\
   257 \hspace*{0pt} ~~~(y :~ys) = rev (v :~va);\\
   258 \hspace*{0pt} ~{\char125}~in (Just y, Queue [] ys);\\
   259 \hspace*{0pt}\\
   260 \hspace*{0pt}enqueue ::~forall a. a -> Queue a -> Queue a;\\
   261 \hspace*{0pt}enqueue x (Queue xs ys) = Queue (x :~xs) ys;\\
   262 \hspace*{0pt}\\
   263 \hspace*{0pt}{\char125}%
   264 \end{isamarkuptext}%
   265 \isamarkuptrue%
   266 %
   267 \endisatagquote
   268 {\isafoldquote}%
   269 %
   270 \isadelimquote
   271 %
   272 \endisadelimquote
   273 %
   274 \begin{isamarkuptext}%
   275 \noindent This demonstrates the basic usage of the \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command;
   276   for more details see \secref{sec:further}.%
   277 \end{isamarkuptext}%
   278 \isamarkuptrue%
   279 %
   280 \isamarkupsubsection{Code generator architecture \label{sec:concept}%
   281 }
   282 \isamarkuptrue%
   283 %
   284 \begin{isamarkuptext}%
   285 What you have seen so far should be already enough in a lot of cases.  If you
   286   are content with this, you can quit reading here.  Anyway, in order to customise
   287   and adapt the code generator, it is inevitable to gain some understanding
   288   how it works.
   289 
   290   \begin{figure}[h]
   291     \begin{tikzpicture}[x = 4.2cm, y = 1cm]
   292       \tikzstyle entity=[rounded corners, draw, thick, color = black, fill = white];
   293       \tikzstyle process=[ellipse, draw, thick, color = green, fill = white];
   294       \tikzstyle process_arrow=[->, semithick, color = green];
   295       \node (HOL) at (0, 4) [style=entity] {\isa{Isabelle{\isacharslash}HOL} theory};
   296       \node (eqn) at (2, 2) [style=entity] {defining equations};
   297       \node (iml) at (2, 0) [style=entity] {intermediate language};
   298       \node (seri) at (1, 0) [style=process] {serialisation};
   299       \node (SML) at (0, 3) [style=entity] {\isa{SML}};
   300       \node (OCaml) at (0, 2) [style=entity] {\isa{OCaml}};
   301       \node (further) at (0, 1) [style=entity] {\isa{{\isasymdots}}};
   302       \node (Haskell) at (0, 0) [style=entity] {\isa{Haskell}};
   303       \draw [style=process_arrow] (HOL) .. controls (2, 4) ..
   304         node [style=process, near start] {selection}
   305         node [style=process, near end] {preprocessing}
   306         (eqn);
   307       \draw [style=process_arrow] (eqn) -- node (transl) [style=process] {translation} (iml);
   308       \draw [style=process_arrow] (iml) -- (seri);
   309       \draw [style=process_arrow] (seri) -- (SML);
   310       \draw [style=process_arrow] (seri) -- (OCaml);
   311       \draw [style=process_arrow, dashed] (seri) -- (further);
   312       \draw [style=process_arrow] (seri) -- (Haskell);
   313     \end{tikzpicture}
   314     \caption{Code generator architecture}
   315     \label{fig:arch}
   316   \end{figure}
   317 
   318   The code generator employs a notion of executability
   319   for three foundational executable ingredients known
   320   from functional programming:
   321   \emph{defining equations}, \emph{datatypes}, and
   322   \emph{type classes}.  A defining equation as a first approximation
   323   is a theorem of the form \isa{f\ t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n\ {\isasymequiv}\ t}
   324   (an equation headed by a constant \isa{f} with arguments
   325   \isa{t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n} and right hand side \isa{t}).
   326   Code generation aims to turn defining equations
   327   into a functional program.  This is achieved by three major
   328   components which operate sequentially, i.e. the result of one is
   329   the input
   330   of the next in the chain,  see diagram \ref{fig:arch}:
   331 
   332   \begin{itemize}
   333 
   334     \item Out of the vast collection of theorems proven in a
   335       \qn{theory}, a reasonable subset modelling
   336       defining equations is \qn{selected}.
   337 
   338     \item On those selected theorems, certain
   339       transformations are carried out
   340       (\qn{preprocessing}).  Their purpose is to turn theorems
   341       representing non- or badly executable
   342       specifications into equivalent but executable counterparts.
   343       The result is a structured collection of \qn{code theorems}.
   344 
   345     \item Before the selected defining equations are continued with,
   346       they can be \qn{preprocessed}, i.e. subjected to theorem
   347       transformations.  This \qn{preprocessor} is an interface which
   348       allows to apply
   349       the full expressiveness of ML-based theorem transformations
   350       to code generation;  motivating examples are shown below, see
   351       \secref{sec:preproc}.
   352       The result of the preprocessing step is a structured collection
   353       of defining equations.
   354 
   355     \item These defining equations are \qn{translated} to a program
   356       in an abstract intermediate language.  Think of it as a kind
   357       of \qt{Mini-Haskell} with four \qn{statements}: \isa{data}
   358       (for datatypes), \isa{fun} (stemming from defining equations),
   359       also \isa{class} and \isa{inst} (for type classes).
   360 
   361     \item Finally, the abstract program is \qn{serialised} into concrete
   362       source code of a target language.
   363 
   364   \end{itemize}
   365 
   366   \noindent From these steps, only the two last are carried out outside the logic;  by
   367   keeping this layer as thin as possible, the amount of code to trust is
   368   kept to a minimum.%
   369 \end{isamarkuptext}%
   370 \isamarkuptrue%
   371 %
   372 \isadelimtheory
   373 %
   374 \endisadelimtheory
   375 %
   376 \isatagtheory
   377 \isacommand{end}\isamarkupfalse%
   378 %
   379 \endisatagtheory
   380 {\isafoldtheory}%
   381 %
   382 \isadelimtheory
   383 %
   384 \endisadelimtheory
   385 \isanewline
   386 \end{isabellebody}%
   387 %%% Local Variables:
   388 %%% mode: latex
   389 %%% TeX-master: "root"
   390 %%% End: