isabelle: comparison doc-src/Codegen/Thy/Introduction.thy

equal deleted inserted replaced

-:8e4058f4848c
+:ffb1c5bf0425
 imports Setup
 begin
 section {* Introduction *}
-subsection {* Code generation fundamental: shallow embedding *}
+text {*
+This tutorial introduces the code generator facilities of @{text
-subsection {* A quick start with the @{text "Isabelle/HOL"} toolbox *}
+"Isabelle/HOL"}.  It allows to turn (a certain class of) HOL
+specifications into corresponding executable code in the programming
-subsection {* Type classes *}
+languages @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml} and
+@{text Haskell} \cite{haskell-revised-report}.
-subsection {* How to continue from here *}
+To profit from this tutorial, some familiarity and experience with
-subsection {* If something goes utterly wrong *}
+@{theory HOL} \cite{isa-tutorial} and its basic theories is assumed.
+*}
-text {*
-This tutorial introduces a generic code generator for the
-@{text Isabelle} system.
+subsection {* Code generation principle: shallow embedding \label{sec:principle} *}
-The
-\qn{target language} for which code is
+text {*
-generated is not fixed, but may be one of several
+The key concept for understanding Isabelle's code generation is
-functional programming languages (currently, the implementation
+\emph{shallow embedding}: logical entities like constants, types and
-supports @{text SML} \cite{SML}, @{text OCaml} \cite{OCaml} and @{text Haskell}
+classes are identified with corresponding entities in the target
-\cite{haskell-revised-report}).
+language.  In particular, the carrier of a generated program's
+semantics are \emph{equational theorems} from the logic.  If we view
-Conceptually the code generator framework is part
+a generated program as an implementation of a higher-order rewrite
-of Isabelle's @{theory Pure} meta logic framework; the logic
+system, then every rewrite step performed by the program can be
-@{theory HOL} \cite{isa-tutorial}, which is an extension of @{theory Pure},
+simulated in the logic, which guarantees partial correctness
-already comes with a reasonable framework setup and thus provides
+\cite{Haftmann-Nipkow:2010:code}.
-a good basis for creating code-generation-driven
+*}
-applications.  So, we assume some familiarity and experience
-with the ingredients of the @{theory HOL} distribution theories.
+subsection {* A quick start with the Isabelle/HOL toolbox \label{sec:queue_example} *}
-The code generator aims to be usable with no further ado
-in most cases, while allowing for detailed customisation.
+text {*
-This can be seen in the structure of this tutorial: after a short
+In a HOL theory, the @{command datatype} and @{command
-conceptual introduction with an example (\secref{sec:intro}),
+definition}/@{command primrec}/@{command fun} declarations form the
-we discuss the generic customisation facilities (\secref{sec:program}).
+core of a functional programming language.  By default equational
-A further section (\secref{sec:adaptation}) is dedicated to the matter of
+theorems stemming from those are used for generated code, therefore
-\qn{adaptation} to specific target language environments.  After some
+\qt{naive} code generation can proceed without further ado.
-further issues (\secref{sec:further}) we conclude with an overview
-of some ML programming interfaces (\secref{sec:ml}).
-\begin{warn}
-Ultimately, the code generator which this tutorial deals with
-is supposed to replace the existing code generator
-by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
-So, for the moment, there are two distinct code generators
-in Isabelle.  In case of ambiguity, we will refer to the framework
-described here as @{text "generic code generator"}, to the
-other as @{text "SML code generator"}.
-Also note that while the framework itself is
-object-logic independent, only @{theory HOL} provides a reasonable
-framework setup.
-\end{warn}
-*}
-subsection {* Code generation via shallow embedding \label{sec:intro} *}
-text {*
-The key concept for understanding @{text Isabelle}'s code generation is
-\emph{shallow embedding}, i.e.~logical entities like constants, types and
-classes are identified with corresponding concepts in the target language.
-Inside @{theory HOL}, the @{command datatype} and
-@{command definition}/@{command primrec}/@{command fun} declarations form
-the core of a functional programming language.  The default code generator setup
-transforms those into functional programs immediately.
-This means that \qt{naive} code generation can proceed without further ado.
 For example, here a simple \qt{implementation} of amortised queues:
 *}
 datatype %quote 'a queue = AQueue "'a list" "'a list"
 fun %quote dequeue :: "'a queue \<Rightarrow> 'a option \<times> 'a queue" where
 "dequeue (AQueue [] []) = (None, AQueue [] [])"
 | "dequeue (AQueue xs (y # ys)) = (Some y, AQueue xs ys)"
 | "dequeue (AQueue xs []) =
-(case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))"
+(case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))" (*<*)
+lemma %invisible dequeue_nonempty_Nil [simp]:
+"xs \<noteq> [] \<Longrightarrow> dequeue (AQueue xs []) = (case rev xs of y # ys \<Rightarrow> (Some y, AQueue [] ys))"
+by (cases xs) (simp_all split: list.splits) (*>*)
 text {* \noindent Then we can generate code e.g.~for @{text SML} as follows: *}
 export_code %quote empty dequeue enqueue in SML
 module_name Example file "examples/example.ML"
 text {* \noindent resulting in the following code: *}
 text %quote {*@{code_stmts empty enqueue dequeue (SML)}*}
 text {*
-\noindent The @{command export_code} command takes a space-separated list of
+\noindent The @{command export_code} command takes a space-separated
-constants for which code shall be generated;  anything else needed for those
+list of constants for which code shall be generated; anything else
-is added implicitly.  Then follows a target language identifier
+needed for those is added implicitly.  Then follows a target
-(@{text SML}, @{text OCaml} or @{text Haskell}) and a freely chosen module name.
+language identifier and a freely chosen module name.  A file name
-A file name denotes the destination to store the generated code.  Note that
+denotes the destination to store the generated code.  Note that the
-the semantics of the destination depends on the target language:  for
+semantics of the destination depends on the target language: for
-@{text SML} and @{text OCaml} it denotes a \emph{file}, for @{text Haskell}
+@{text SML} and @{text OCaml} it denotes a \emph{file}, for @{text
-it denotes a \emph{directory} where a file named as the module name
+Haskell} it denotes a \emph{directory} where a file named as the
-(with extension @{text ".hs"}) is written:
+module name (with extension @{text ".hs"}) is written:
 *}
 export_code %quote empty dequeue enqueue in Haskell
 module_name Example file "examples/"
 text {*
-\noindent This is the corresponding code in @{text Haskell}:
+\noindent This is the corresponding code:
 *}
 text %quote {*@{code_stmts empty enqueue dequeue (Haskell)}*}
 text {*
-\noindent This demonstrates the basic usage of the @{command export_code} command;
+\noindent For more details about @{command export_code} see
-for more details see \secref{sec:further}.
+\secref{sec:further}.
 *}
-subsection {* If something utterly fails *}
+subsection {* Type classes *}
-text {*
-Under certain circumstances, the code generator fails to produce
+text {*
-code entirely.
+Code can also be generated from type classes in a Haskell-like
+manner.  For illustration here an example from abstract algebra:
-\begin{description}
+*}
-\ditem{generate only one module}
+class %quote semigroup =
+fixes mult :: "'a \<Rightarrow> 'a \<Rightarrow> 'a" (infixl "\<otimes>" 70)
-\ditem{check with a different target language}
+assumes assoc: "(x \<otimes> y) \<otimes> z = x \<otimes> (y \<otimes> z)"
-\ditem{inspect code equations}
+class %quote monoid = semigroup +
+fixes neutral :: 'a ("\<one>")
-\ditem{inspect preprocessor setup}
+assumes neutl: "\<one> \<otimes> x = x"
+and neutr: "x \<otimes> \<one> = x"
-\ditem{generate exceptions}
+instantiation %quote nat :: monoid
-\ditem{remove offending code equations}
+begin
-\end{description}
+primrec %quote mult_nat where
-*}
+"0 \<otimes> n = (0\<Colon>nat)"
+| "Suc m \<otimes> n = n + m \<otimes> n"
-subsection {* Code generator architecture \label{sec:concept} *}
+definition %quote neutral_nat where
-text {*
+"\<one> = Suc 0"
-What you have seen so far should be already enough in a lot of cases.  If you
-are content with this, you can quit reading here.  Anyway, in order to customise
+lemma %quote add_mult_distrib:
-and adapt the code generator, it is necessary to gain some understanding
+fixes n m q :: nat
-how it works.
+shows "(n + m) \<otimes> q = n \<otimes> q + m \<otimes> q"
+by (induct n) simp_all
-\begin{figure}[h]
-\includegraphics{architecture}
+instance %quote proof
-\caption{Code generator architecture}
+fix m n q :: nat
-\label{fig:arch}
+show "m \<otimes> n \<otimes> q = m \<otimes> (n \<otimes> q)"
-\end{figure}
+by (induct m) (simp_all add: add_mult_distrib)
+show "\<one> \<otimes> n = n"
-The code generator employs a notion of executability
+by (simp add: neutral_nat_def)
-for three foundational executable ingredients known
+show "m \<otimes> \<one> = m"
-from functional programming:
+by (induct m) (simp_all add: neutral_nat_def)
-\emph{code equations}, \emph{datatypes}, and
+qed
-\emph{type classes}.  A code equation as a first approximation
-is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
+end %quote
-(an equation headed by a constant @{text f} with arguments
-@{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}).
+text {*
-Code generation aims to turn code equations
+\noindent We define the natural operation of the natural numbers
-into a functional program.  This is achieved by three major
+on monoids:
-components which operate sequentially, i.e. the result of one is
+*}
-the input
-of the next in the chain,  see figure \ref{fig:arch}:
+primrec %quote (in monoid) pow :: "nat \<Rightarrow> 'a \<Rightarrow> 'a" where
+"pow 0 a = \<one>"
+| "pow (Suc n) a = a \<otimes> pow n a"
+text {*
+\noindent This we use to define the discrete exponentiation
+function:
+*}
+definition %quote bexp :: "nat \<Rightarrow> nat" where
+"bexp n = pow n (Suc (Suc 0))"
+text {*
+\noindent The corresponding code in Haskell uses that language's
+native classes:
+*}
+text %quote {*@{code_stmts bexp (Haskell)}*}
+text {*
+\noindent This is a convenient place to show how explicit dictionary
+construction manifests in generated code -- the same example in
+@{text SML}:
+*}
+text %quote {*@{code_stmts bexp (SML)}*}
+text {*
+\noindent Note the parameters with trailing underscore (@{verbatim
+"A_"}), which are the dictionary parameters.
+*}
+subsection {* How to continue from here *}
+text {*
+What you have seen so far should be already enough in a lot of
+cases.  If you are content with this, you can quit reading here.
+Anyway, to understand situations where problems occur or to increase
+the scope of code generation beyond default, it is necessary to gain
+some understanding how the code generator actually works:
 \begin{itemize}
-\item The starting point is a collection of raw code equations in a
+\item The foundations of the code generator are described in
-theory. It is not relevant where they
+\secref{sec:foundations}.
-stem from, but typically they were either produced by specification
-tools or proved explicitly by the user.
+\item In particular \secref{sec:utterly_wrong} gives hints how to
+debug situations where code generation does not succeed as
-\item These raw code equations can be subjected to theorem transformations.  This
+expected.
-\qn{preprocessor} can apply the full
-expressiveness of ML-based theorem transformations to code
+\item The scope and quality of generated code can be increased
-generation.  The result of preprocessing is a
+dramatically by applying refinement techniques, which are
-structured collection of code equations.
+introduced in \secref{sec:refinement}.
-\item These code equations are \qn{translated} to a program in an
+\item Inductive predicates can be turned executable using an
-abstract intermediate language.  Think of it as a kind
+extension of the code generator \secref{sec:inductive}.
-of \qt{Mini-Haskell} with four \qn{statements}: @{text data}
-(for datatypes), @{text fun} (stemming from code equations),
+\item You may want to skim over the more technical sections
-also @{text class} and @{text inst} (for type classes).
+\secref{sec:adaptation} and \secref{sec:further}.
-\item Finally, the abstract program is \qn{serialised} into concrete
+\item For exhaustive syntax diagrams etc. you should visit the
-source code of a target language.
+Isabelle/Isar Reference Manual \cite{isabelle-isar-ref}.
-This step only produces concrete syntax but does not change the
-program in essence; all conceptual transformations occur in the
-translation step.
 \end{itemize}
-\noindent From these steps, only the two last are carried out outside the logic;  by
+\bigskip
-keeping this layer as thin as possible, the amount of code to trust is
-kept to a minimum.
+\begin{center}\fbox{\fbox{\begin{minipage}{8cm}
+\begin{center}\textit{Happy proving, happy hacking!}\end{center}
+\end{minipage}}}\end{center}
+\begin{warn}
+There is also a more ancient code generator in Isabelle by Stefan
+Berghofer \cite{Berghofer-Nipkow:2002}.  Although its
+functionality is covered by the code generator presented here, it
+will sometimes show up as an artifact.  In case of ambiguity, we
+will refer to the framework described here as @{text "generic code
+generator"}, to the other as @{text "SML code generator"}.
+\end{warn}
 *}
 end

changeset 38437	ffb1c5bf0425
parent 38405	7935b334893e
child 38505	2f8699695cf6