doc-src/IsarAdvanced/Codegen/Thy/document/Introduction.tex
changeset 30226 2f4684e2ea95
parent 30202 2775062fd3a9
child 30227 853abb4853cc
equal deleted inserted replaced
30202:2775062fd3a9 30226:2f4684e2ea95
     1 %
       
     2 \begin{isabellebody}%
       
     3 \def\isabellecontext{Introduction}%
       
     4 %
       
     5 \isadelimtheory
       
     6 %
       
     7 \endisadelimtheory
       
     8 %
       
     9 \isatagtheory
       
    10 \isacommand{theory}\isamarkupfalse%
       
    11 \ Introduction\isanewline
       
    12 \isakeyword{imports}\ Setup\isanewline
       
    13 \isakeyword{begin}%
       
    14 \endisatagtheory
       
    15 {\isafoldtheory}%
       
    16 %
       
    17 \isadelimtheory
       
    18 %
       
    19 \endisadelimtheory
       
    20 %
       
    21 \isamarkupchapter{Code generation from \isa{Isabelle{\isacharslash}HOL} theories%
       
    22 }
       
    23 \isamarkuptrue%
       
    24 %
       
    25 \isamarkupsection{Introduction and Overview%
       
    26 }
       
    27 \isamarkuptrue%
       
    28 %
       
    29 \begin{isamarkuptext}%
       
    30 This tutorial introduces a generic code generator for the
       
    31   \isa{Isabelle} system.
       
    32   Generic in the sense that the
       
    33   \qn{target language} for which code shall ultimately be
       
    34   generated is not fixed but may be an arbitrary state-of-the-art
       
    35   functional programming language (currently, the implementation
       
    36   supports \isa{SML} \cite{SML}, \isa{OCaml} \cite{OCaml} and \isa{Haskell}
       
    37   \cite{haskell-revised-report}).
       
    38 
       
    39   Conceptually the code generator framework is part
       
    40   of Isabelle's \hyperlink{theory.Pure}{\mbox{\isa{Pure}}} meta logic framework; the logic
       
    41   \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} which is an extension of \hyperlink{theory.Pure}{\mbox{\isa{Pure}}}
       
    42   already comes with a reasonable framework setup and thus provides
       
    43   a good working horse for raising code-generation-driven
       
    44   applications.  So, we assume some familiarity and experience
       
    45   with the ingredients of the \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} distribution theories.
       
    46   (see also \cite{isa-tutorial}).
       
    47 
       
    48   The code generator aims to be usable with no further ado
       
    49   in most cases while allowing for detailed customisation.
       
    50   This manifests in the structure of this tutorial: after a short
       
    51   conceptual introduction with an example (\secref{sec:intro}),
       
    52   we discuss the generic customisation facilities (\secref{sec:program}).
       
    53   A further section (\secref{sec:adaption}) is dedicated to the matter of
       
    54   \qn{adaption} to specific target language environments.  After some
       
    55   further issues (\secref{sec:further}) we conclude with an overview
       
    56   of some ML programming interfaces (\secref{sec:ml}).
       
    57 
       
    58   \begin{warn}
       
    59     Ultimately, the code generator which this tutorial deals with
       
    60     is supposed to replace the existing code generator
       
    61     by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
       
    62     So, for the moment, there are two distinct code generators
       
    63     in Isabelle.  In case of ambiguity, we will refer to the framework
       
    64     described here as \isa{generic\ code\ generator}, to the
       
    65     other as \isa{SML\ code\ generator}.
       
    66     Also note that while the framework itself is
       
    67     object-logic independent, only \hyperlink{theory.HOL}{\mbox{\isa{HOL}}} provides a reasonable
       
    68     framework setup.    
       
    69   \end{warn}%
       
    70 \end{isamarkuptext}%
       
    71 \isamarkuptrue%
       
    72 %
       
    73 \isamarkupsubsection{Code generation via shallow embedding \label{sec:intro}%
       
    74 }
       
    75 \isamarkuptrue%
       
    76 %
       
    77 \begin{isamarkuptext}%
       
    78 The key concept for understanding \isa{Isabelle}'s code generation is
       
    79   \emph{shallow embedding}, i.e.~logical entities like constants, types and
       
    80   classes are identified with corresponding concepts in the target language.
       
    81 
       
    82   Inside \hyperlink{theory.HOL}{\mbox{\isa{HOL}}}, the \hyperlink{command.datatype}{\mbox{\isa{\isacommand{datatype}}}} and
       
    83   \hyperlink{command.definition}{\mbox{\isa{\isacommand{definition}}}}/\hyperlink{command.primrec}{\mbox{\isa{\isacommand{primrec}}}}/\hyperlink{command.fun}{\mbox{\isa{\isacommand{fun}}}} declarations form
       
    84   the core of a functional programming language.  The default code generator setup
       
    85   allows to turn those into functional programs immediately.
       
    86   This means that \qt{naive} code generation can proceed without further ado.
       
    87   For example, here a simple \qt{implementation} of amortised queues:%
       
    88 \end{isamarkuptext}%
       
    89 \isamarkuptrue%
       
    90 %
       
    91 \isadelimquote
       
    92 %
       
    93 \endisadelimquote
       
    94 %
       
    95 \isatagquote
       
    96 \isacommand{datatype}\isamarkupfalse%
       
    97 \ {\isacharprime}a\ queue\ {\isacharequal}\ AQueue\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\ {\isachardoublequoteopen}{\isacharprime}a\ list{\isachardoublequoteclose}\isanewline
       
    98 \isanewline
       
    99 \isacommand{definition}\isamarkupfalse%
       
   100 \ empty\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
       
   101 \ \ {\isachardoublequoteopen}empty\ {\isacharequal}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isachardoublequoteclose}\isanewline
       
   102 \isanewline
       
   103 \isacommand{primrec}\isamarkupfalse%
       
   104 \ enqueue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ {\isasymRightarrow}\ {\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
       
   105 \ \ {\isachardoublequoteopen}enqueue\ x\ {\isacharparenleft}AQueue\ xs\ ys{\isacharparenright}\ {\isacharequal}\ AQueue\ {\isacharparenleft}x\ {\isacharhash}\ xs{\isacharparenright}\ ys{\isachardoublequoteclose}\isanewline
       
   106 \isanewline
       
   107 \isacommand{fun}\isamarkupfalse%
       
   108 \ dequeue\ {\isacharcolon}{\isacharcolon}\ {\isachardoublequoteopen}{\isacharprime}a\ queue\ {\isasymRightarrow}\ {\isacharprime}a\ option\ {\isasymtimes}\ {\isacharprime}a\ queue{\isachardoublequoteclose}\ \isakeyword{where}\isanewline
       
   109 \ \ \ \ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}None{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}{\isachardoublequoteclose}\isanewline
       
   110 \ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharparenleft}y\ {\isacharhash}\ ys{\isacharparenright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ xs\ ys{\isacharparenright}{\isachardoublequoteclose}\isanewline
       
   111 \ \ {\isacharbar}\ {\isachardoublequoteopen}dequeue\ {\isacharparenleft}AQueue\ xs\ {\isacharbrackleft}{\isacharbrackright}{\isacharparenright}\ {\isacharequal}\isanewline
       
   112 \ \ \ \ \ \ {\isacharparenleft}case\ rev\ xs\ of\ y\ {\isacharhash}\ ys\ {\isasymRightarrow}\ {\isacharparenleft}Some\ y{\isacharcomma}\ AQueue\ {\isacharbrackleft}{\isacharbrackright}\ ys{\isacharparenright}{\isacharparenright}{\isachardoublequoteclose}%
       
   113 \endisatagquote
       
   114 {\isafoldquote}%
       
   115 %
       
   116 \isadelimquote
       
   117 %
       
   118 \endisadelimquote
       
   119 %
       
   120 \begin{isamarkuptext}%
       
   121 \noindent Then we can generate code e.g.~for \isa{SML} as follows:%
       
   122 \end{isamarkuptext}%
       
   123 \isamarkuptrue%
       
   124 %
       
   125 \isadelimquote
       
   126 %
       
   127 \endisadelimquote
       
   128 %
       
   129 \isatagquote
       
   130 \isacommand{export{\isacharunderscore}code}\isamarkupfalse%
       
   131 \ empty\ dequeue\ enqueue\ \isakeyword{in}\ SML\isanewline
       
   132 \ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}example{\isachardot}ML{\isachardoublequoteclose}%
       
   133 \endisatagquote
       
   134 {\isafoldquote}%
       
   135 %
       
   136 \isadelimquote
       
   137 %
       
   138 \endisadelimquote
       
   139 %
       
   140 \begin{isamarkuptext}%
       
   141 \noindent resulting in the following code:%
       
   142 \end{isamarkuptext}%
       
   143 \isamarkuptrue%
       
   144 %
       
   145 \isadelimquote
       
   146 %
       
   147 \endisadelimquote
       
   148 %
       
   149 \isatagquote
       
   150 %
       
   151 \begin{isamarkuptext}%
       
   152 \isatypewriter%
       
   153 \noindent%
       
   154 \hspace*{0pt}structure Example = \\
       
   155 \hspace*{0pt}struct\\
       
   156 \hspace*{0pt}\\
       
   157 \hspace*{0pt}fun foldl f a [] = a\\
       
   158 \hspace*{0pt} ~| foldl f a (x ::~xs) = foldl f (f a x) xs;\\
       
   159 \hspace*{0pt}\\
       
   160 \hspace*{0pt}fun rev xs = foldl (fn xsa => fn x => x ::~xsa) [] xs;\\
       
   161 \hspace*{0pt}\\
       
   162 \hspace*{0pt}fun list{\char95}case f1 f2 (a ::~lista) = f2 a lista\\
       
   163 \hspace*{0pt} ~| list{\char95}case f1 f2 [] = f1;\\
       
   164 \hspace*{0pt}\\
       
   165 \hspace*{0pt}datatype 'a queue = AQueue of 'a list * 'a list;\\
       
   166 \hspace*{0pt}\\
       
   167 \hspace*{0pt}val empty :~'a queue = AQueue ([],~[])\\
       
   168 \hspace*{0pt}\\
       
   169 \hspace*{0pt}fun dequeue (AQueue ([],~[])) = (NONE,~AQueue ([],~[]))\\
       
   170 \hspace*{0pt} ~| dequeue (AQueue (xs,~y ::~ys)) = (SOME y,~AQueue (xs,~ys))\\
       
   171 \hspace*{0pt} ~| dequeue (AQueue (v ::~va,~[])) =\\
       
   172 \hspace*{0pt} ~~~let\\
       
   173 \hspace*{0pt} ~~~~~val y ::~ys = rev (v ::~va);\\
       
   174 \hspace*{0pt} ~~~in\\
       
   175 \hspace*{0pt} ~~~~~(SOME y,~AQueue ([],~ys))\\
       
   176 \hspace*{0pt} ~~~end;\\
       
   177 \hspace*{0pt}\\
       
   178 \hspace*{0pt}fun enqueue x (AQueue (xs,~ys)) = AQueue (x ::~xs,~ys);\\
       
   179 \hspace*{0pt}\\
       
   180 \hspace*{0pt}end;~(*struct Example*)%
       
   181 \end{isamarkuptext}%
       
   182 \isamarkuptrue%
       
   183 %
       
   184 \endisatagquote
       
   185 {\isafoldquote}%
       
   186 %
       
   187 \isadelimquote
       
   188 %
       
   189 \endisadelimquote
       
   190 %
       
   191 \begin{isamarkuptext}%
       
   192 \noindent The \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command takes a space-separated list of
       
   193   constants for which code shall be generated;  anything else needed for those
       
   194   is added implicitly.  Then follows a target language identifier
       
   195   (\isa{SML}, \isa{OCaml} or \isa{Haskell}) and a freely chosen module name.
       
   196   A file name denotes the destination to store the generated code.  Note that
       
   197   the semantics of the destination depends on the target language:  for
       
   198   \isa{SML} and \isa{OCaml} it denotes a \emph{file}, for \isa{Haskell}
       
   199   it denotes a \emph{directory} where a file named as the module name
       
   200   (with extension \isa{{\isachardot}hs}) is written:%
       
   201 \end{isamarkuptext}%
       
   202 \isamarkuptrue%
       
   203 %
       
   204 \isadelimquote
       
   205 %
       
   206 \endisadelimquote
       
   207 %
       
   208 \isatagquote
       
   209 \isacommand{export{\isacharunderscore}code}\isamarkupfalse%
       
   210 \ empty\ dequeue\ enqueue\ \isakeyword{in}\ Haskell\isanewline
       
   211 \ \ \isakeyword{module{\isacharunderscore}name}\ Example\ \isakeyword{file}\ {\isachardoublequoteopen}examples{\isacharslash}{\isachardoublequoteclose}%
       
   212 \endisatagquote
       
   213 {\isafoldquote}%
       
   214 %
       
   215 \isadelimquote
       
   216 %
       
   217 \endisadelimquote
       
   218 %
       
   219 \begin{isamarkuptext}%
       
   220 \noindent This is how the corresponding code in \isa{Haskell} looks like:%
       
   221 \end{isamarkuptext}%
       
   222 \isamarkuptrue%
       
   223 %
       
   224 \isadelimquote
       
   225 %
       
   226 \endisadelimquote
       
   227 %
       
   228 \isatagquote
       
   229 %
       
   230 \begin{isamarkuptext}%
       
   231 \isatypewriter%
       
   232 \noindent%
       
   233 \hspace*{0pt}module Example where {\char123}\\
       
   234 \hspace*{0pt}\\
       
   235 \hspace*{0pt}\\
       
   236 \hspace*{0pt}foldla ::~forall a b.~(a -> b -> a) -> a -> [b] -> a;\\
       
   237 \hspace*{0pt}foldla f a [] = a;\\
       
   238 \hspace*{0pt}foldla f a (x :~xs) = foldla f (f a x) xs;\\
       
   239 \hspace*{0pt}\\
       
   240 \hspace*{0pt}rev ::~forall a.~[a] -> [a];\\
       
   241 \hspace*{0pt}rev xs = foldla ({\char92}~xsa x -> x :~xsa) [] xs;\\
       
   242 \hspace*{0pt}\\
       
   243 \hspace*{0pt}list{\char95}case ::~forall t a.~t -> (a -> [a] -> t) -> [a] -> t;\\
       
   244 \hspace*{0pt}list{\char95}case f1 f2 (a :~list) = f2 a list;\\
       
   245 \hspace*{0pt}list{\char95}case f1 f2 [] = f1;\\
       
   246 \hspace*{0pt}\\
       
   247 \hspace*{0pt}data Queue a = AQueue [a] [a];\\
       
   248 \hspace*{0pt}\\
       
   249 \hspace*{0pt}empty ::~forall a.~Queue a;\\
       
   250 \hspace*{0pt}empty = AQueue [] [];\\
       
   251 \hspace*{0pt}\\
       
   252 \hspace*{0pt}dequeue ::~forall a.~Queue a -> (Maybe a,~Queue a);\\
       
   253 \hspace*{0pt}dequeue (AQueue [] []) = (Nothing,~AQueue [] []);\\
       
   254 \hspace*{0pt}dequeue (AQueue xs (y :~ys)) = (Just y,~AQueue xs ys);\\
       
   255 \hspace*{0pt}dequeue (AQueue (v :~va) []) =\\
       
   256 \hspace*{0pt} ~let {\char123}\\
       
   257 \hspace*{0pt} ~~~(y :~ys) = rev (v :~va);\\
       
   258 \hspace*{0pt} ~{\char125}~in (Just y,~AQueue [] ys);\\
       
   259 \hspace*{0pt}\\
       
   260 \hspace*{0pt}enqueue ::~forall a.~a -> Queue a -> Queue a;\\
       
   261 \hspace*{0pt}enqueue x (AQueue xs ys) = AQueue (x :~xs) ys;\\
       
   262 \hspace*{0pt}\\
       
   263 \hspace*{0pt}{\char125}%
       
   264 \end{isamarkuptext}%
       
   265 \isamarkuptrue%
       
   266 %
       
   267 \endisatagquote
       
   268 {\isafoldquote}%
       
   269 %
       
   270 \isadelimquote
       
   271 %
       
   272 \endisadelimquote
       
   273 %
       
   274 \begin{isamarkuptext}%
       
   275 \noindent This demonstrates the basic usage of the \hyperlink{command.export-code}{\mbox{\isa{\isacommand{export{\isacharunderscore}code}}}} command;
       
   276   for more details see \secref{sec:further}.%
       
   277 \end{isamarkuptext}%
       
   278 \isamarkuptrue%
       
   279 %
       
   280 \isamarkupsubsection{Code generator architecture \label{sec:concept}%
       
   281 }
       
   282 \isamarkuptrue%
       
   283 %
       
   284 \begin{isamarkuptext}%
       
   285 What you have seen so far should be already enough in a lot of cases.  If you
       
   286   are content with this, you can quit reading here.  Anyway, in order to customise
       
   287   and adapt the code generator, it is inevitable to gain some understanding
       
   288   how it works.
       
   289 
       
   290   \begin{figure}[h]
       
   291     \begin{tikzpicture}[x = 4.2cm, y = 1cm]
       
   292       \tikzstyle entity=[rounded corners, draw, thick, color = black, fill = white];
       
   293       \tikzstyle process=[ellipse, draw, thick, color = green, fill = white];
       
   294       \tikzstyle process_arrow=[->, semithick, color = green];
       
   295       \node (HOL) at (0, 4) [style=entity] {\isa{Isabelle{\isacharslash}HOL} theory};
       
   296       \node (eqn) at (2, 2) [style=entity] {code equations};
       
   297       \node (iml) at (2, 0) [style=entity] {intermediate language};
       
   298       \node (seri) at (1, 0) [style=process] {serialisation};
       
   299       \node (SML) at (0, 3) [style=entity] {\isa{SML}};
       
   300       \node (OCaml) at (0, 2) [style=entity] {\isa{OCaml}};
       
   301       \node (further) at (0, 1) [style=entity] {\isa{{\isasymdots}}};
       
   302       \node (Haskell) at (0, 0) [style=entity] {\isa{Haskell}};
       
   303       \draw [style=process_arrow] (HOL) .. controls (2, 4) ..
       
   304         node [style=process, near start] {selection}
       
   305         node [style=process, near end] {preprocessing}
       
   306         (eqn);
       
   307       \draw [style=process_arrow] (eqn) -- node (transl) [style=process] {translation} (iml);
       
   308       \draw [style=process_arrow] (iml) -- (seri);
       
   309       \draw [style=process_arrow] (seri) -- (SML);
       
   310       \draw [style=process_arrow] (seri) -- (OCaml);
       
   311       \draw [style=process_arrow, dashed] (seri) -- (further);
       
   312       \draw [style=process_arrow] (seri) -- (Haskell);
       
   313     \end{tikzpicture}
       
   314     \caption{Code generator architecture}
       
   315     \label{fig:arch}
       
   316   \end{figure}
       
   317 
       
   318   The code generator employs a notion of executability
       
   319   for three foundational executable ingredients known
       
   320   from functional programming:
       
   321   \emph{code equations}, \emph{datatypes}, and
       
   322   \emph{type classes}.  A code equation as a first approximation
       
   323   is a theorem of the form \isa{f\ t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n\ {\isasymequiv}\ t}
       
   324   (an equation headed by a constant \isa{f} with arguments
       
   325   \isa{t\isactrlisub {\isadigit{1}}\ t\isactrlisub {\isadigit{2}}\ {\isasymdots}\ t\isactrlisub n} and right hand side \isa{t}).
       
   326   Code generation aims to turn code equations
       
   327   into a functional program.  This is achieved by three major
       
   328   components which operate sequentially, i.e. the result of one is
       
   329   the input
       
   330   of the next in the chain,  see diagram \ref{fig:arch}:
       
   331 
       
   332   \begin{itemize}
       
   333 
       
   334     \item Out of the vast collection of theorems proven in a
       
   335       \qn{theory}, a reasonable subset modelling
       
   336       code equations is \qn{selected}.
       
   337 
       
   338     \item On those selected theorems, certain
       
   339       transformations are carried out
       
   340       (\qn{preprocessing}).  Their purpose is to turn theorems
       
   341       representing non- or badly executable
       
   342       specifications into equivalent but executable counterparts.
       
   343       The result is a structured collection of \qn{code theorems}.
       
   344 
       
   345     \item Before the selected code equations are continued with,
       
   346       they can be \qn{preprocessed}, i.e. subjected to theorem
       
   347       transformations.  This \qn{preprocessor} is an interface which
       
   348       allows to apply
       
   349       the full expressiveness of ML-based theorem transformations
       
   350       to code generation;  motivating examples are shown below, see
       
   351       \secref{sec:preproc}.
       
   352       The result of the preprocessing step is a structured collection
       
   353       of code equations.
       
   354 
       
   355     \item These code equations are \qn{translated} to a program
       
   356       in an abstract intermediate language.  Think of it as a kind
       
   357       of \qt{Mini-Haskell} with four \qn{statements}: \isa{data}
       
   358       (for datatypes), \isa{fun} (stemming from code equations),
       
   359       also \isa{class} and \isa{inst} (for type classes).
       
   360 
       
   361     \item Finally, the abstract program is \qn{serialised} into concrete
       
   362       source code of a target language.
       
   363 
       
   364   \end{itemize}
       
   365 
       
   366   \noindent From these steps, only the two last are carried out outside the logic;  by
       
   367   keeping this layer as thin as possible, the amount of code to trust is
       
   368   kept to a minimum.%
       
   369 \end{isamarkuptext}%
       
   370 \isamarkuptrue%
       
   371 %
       
   372 \isadelimtheory
       
   373 %
       
   374 \endisadelimtheory
       
   375 %
       
   376 \isatagtheory
       
   377 \isacommand{end}\isamarkupfalse%
       
   378 %
       
   379 \endisatagtheory
       
   380 {\isafoldtheory}%
       
   381 %
       
   382 \isadelimtheory
       
   383 %
       
   384 \endisadelimtheory
       
   385 \isanewline
       
   386 \end{isabellebody}%
       
   387 %%% Local Variables:
       
   388 %%% mode: latex
       
   389 %%% TeX-master: "root"
       
   390 %%% End: