--- a/doc-src/IsarAdvanced/Codegen/Thy/Codegen.thy Fri Oct 20 10:44:34 2006 +0200
+++ b/doc-src/IsarAdvanced/Codegen/Thy/Codegen.thy Fri Oct 20 10:44:35 2006 +0200
@@ -9,39 +9,394 @@
section {* Introduction *}
+subsection {* Motivation *}
+
text {*
- \cite{isa-tutorial}
-*} (* add graphics here *)
+ Executing formal specifications as programs is a well-established
+ topic in the theorem proving community. With increasing
+ application of theorem proving systems in the area of
+ software development and verification, its relevance manifests
+ for running test cases and rapid prototyping. In logical
+ calculi like constructive type theory,
+ a notion of executability is implicit due to the nature
+ of the calculus. In contrast, specifications in Isabelle/HOL
+ can be highly non-executable. In order to bridge
+ the gap between logic and executable specifications,
+ an explicit non-trivial transformation has to be applied:
+ code generation.
+
+ This tutorial introduces a generic code generator for the
+ Isabelle system \cite{isa-tutorial}.
+ Generic in the sense that the
+ \qn{target language} for which code shall ultimately be
+ generated is not fixed but may be an arbitrary state-of-the-art
+ functional programming language (currently, the implementation
+ supports SML \cite{web:sml} and Haskell \cite{web:haskell}).
+ We aim to provide a
+ versatile environment
+ suitable for software development and verification,
+ structuring the process
+ of code generation into a small set of orthogonal principles
+ while achieving a big coverage of application areas
+ with maximum flexibility.
+*}
+
+
+subsection {* Overview *}
+
+text {*
+ The code generator aims to be usable with no further ado
+ in most cases while allowing for detailed customization.
+ This manifests in the structure of this tutorial: this introduction
+ continous with a short introduction of concepts. Section
+ \secref{sec:basics} explains how to use the framework naivly,
+ presuming a reasonable default setup. Then, section
+ \secref{sec:advanced} deals with advanced topics,
+ introducing further aspects of the code generator framework
+ in a motivation-driven manner. Last, section \secref{sec:ml}
+ introduces the framework's internal programming interfaces.
-section {* Basics *}
+ \begin{warn}
+ Ultimately, the code generator which this tutorial deals with
+ is supposed to replace the already established code generator
+ by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
+ So, for the momennt, there are two distinct code generators
+ in Isabelle.
+ Also note that while the framework itself is largely
+ object-logic independent, only HOL provides a reasonable
+ framework setup.
+ \end{warn}
+*}
+
+
+subsection {* Code generation process *}
+
+text {*
+ The code generator employs a notion of executability
+ for three foundational executable ingredients known
+ from functional programming:
+ \emph{function equations}, \emph{datatypes}, and
+ \emph{type classes}. A function equation as a first approximation
+ is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"}
+ (an equation headed by a constant @{text f} with arguments
+ @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right hand side @{text t}.
+ Code generation aims to turn function equations
+ into a functional program by running through
+ a process (see figure \ref{fig:process}):
+
+ \begin{itemize}
+
+ \item Out of the vast collection of theorems proven in a
+ \qn{theory}, a reasonable subset modeling
+ function equations is \qn{selected}.
+
+ \item On those selected theorems, certain
+ transformations are carried out
+ (\qn{preprocessing}). Their purpose is to turn theorems
+ representing non- or badly executable
+ specifications into equivalent but executable counterparts.
+ The result is a structured collection of \qn{code theorems}.
+
+ \item These \qn{code theorems} then are extracted
+ into an Haskell-like intermediate
+ language.
+
+ \item Finally, out of the intermediate language the final
+ code in the desired \qn{target language} is \qn{serialized}.
+
+ \end{itemize}
+
+ \begin{figure}[h]
+ \centering
+ \includegraphics[width=0.3\textwidth]{codegen_process}
+ \caption{code generator -- processing overview}
+ \label{fig:process}
+ \end{figure}
+
+ From these steps, only the two last are carried out
+ outside the logic; by keeping this layer as
+ thin as possible, the amount of code to trust is
+ kept to a minimum.
+*}
+
+
+
+section {* Basics \label{sec:basics} *}
subsection {* Invoking the code generator *}
+text {*
+ Thanks to a reasonable setup of the HOL theories, in
+ most cases code generation proceeds without further ado:
+*}
+
+consts
+ fac :: "nat \<Rightarrow> nat"
+
+primrec
+ "fac 0 = 1"
+ "fac (Suc n) = Suc n * fac n"
+
+text {*
+ This executable specification is now turned to SML code:
+*}
+
+code_gen fac (SML "examples/fac.ML")
+
+text {*
+ The \isasymCODEGEN command takes a space-seperated list of
+ constants together with \qn{serialization directives}
+ in parentheses. These start with a \qn{target language}
+ identifer, followed by arguments, their semantics
+ depending on the target. In the SML case, a filename
+ is given where to write the generated code to.
+
+ Internally, the function equations for all selected
+ constants are taken, including any tranitivly required
+ constants, datatypes and classes, resulting in the following
+ code:
+
+ \lstsml{Thy/examples/fac.ML}
+
+ The code generator will complain when a required
+ ingredient does not provide a executable counterpart.
+ This is the case if an involved type is not a datatype:
+*}
+
+(*<*)
+setup {* Sign.add_path "foo" *}
+(*>*)
+
+typedecl 'a foo
+
+definition
+ bar :: "'a foo \<Rightarrow> 'a \<Rightarrow> 'a"
+ "bar x y = y"
+
+(*<*)
+hide type foo
+hide const bar
+
+setup {* Sign.parent_path *}
+
+datatype 'a foo = Foo
+
+definition
+ bar :: "'a foo \<Rightarrow> 'a \<Rightarrow> 'a"
+ "bar x y = y"
+(*>*)
+
+code_gen bar (SML "examples/fail_type.ML")
+
+text {*
+ \noindent will result in an error. Likewise, generating code
+ for constants not yielding
+ a function equation will fail, e.g.~the Hilbert choice
+ operation @{text "SOME"}:
+*}
+
+(*<*)
+setup {* Sign.add_path "foo" *}
+(*>*)
+
+definition
+ pick_some :: "'a list \<Rightarrow> 'a"
+ "pick_some xs = (SOME x. x \<in> set xs)"
+
+(*<*)
+hide const pick_some
+
+setup {* Sign.parent_path *}
+
+definition
+ pick_some :: "'a list \<Rightarrow> 'a"
+ "pick_some = hd"
+(*>*)
+
+code_gen pick_some (SML "examples/fail_const.ML")
+
subsection {* Theorem selection *}
-(* print_codethms, code func, default setup code nofunc *)
+text {*
+ The list of all function equations in a theory may be inspected
+ using the \isasymPRINTCODETHMS command:
+*}
+
+print_codethms
+
+text {*
+ \noindent which displays a table of constant with corresponding
+ function equations (the additional stuff displayed
+ shall not bother us for the moment). If this table does
+ not provide at least one function
+ equation, the table of primititve definitions is searched
+ whether it provides one.
+
+ The typical HOL tools are already set up in a way that
+ function definitions introduced by \isasymFUN, \isasymFUNCTION,
+ \isasymPRIMREC, \isasymRECDEF are implicitly propagated
+ to this function equation table. Specific theorems may be
+ selected using an attribute: \emph{code func}. As example,
+ a weight selector function:
+*}
+
+consts
+ pick :: "(nat \<times> 'a) list \<Rightarrow> nat \<Rightarrow> 'a"
+
+primrec
+ "pick (x#xs) n = (let (k, v) = x in
+ if n < k then v else pick xs (n - k))"
+
+text {*
+ We want to eliminate the explicit destruction
+ of @{term x} to @{term "(k, v)"}:
+*}
+
+lemma [code func]:
+ "pick ((k, v)#xs) n = (if n < k then v else pick xs (n - k))"
+ by simp
+
+code_gen pick (SML "examples/pick1.ML")
+
+text {*
+ This theorem is now added to the function equation table:
+
+ \lstsml{Thy/examples/pick1.ML}
+
+ It might be convenient to remove the pointless original
+ equation, using the \emph{nofunc} attribute:
+*}
+
+lemmas [code nofunc] = pick.simps
+
+code_gen pick (SML "examples/pick2.ML")
+
+text {*
+ \lstsml{Thy/examples/pick2.ML}
+
+ Syntactic redundancies are implicitly dropped. For example,
+ using a modified version of the @{const fac} function
+ as function equation, the then redundant (since
+ syntactically more subsumed) original function equations
+ are dropped, resulting in a warning:
+*}
+
+lemma [code func]:
+ "fac n = (case n of 0 \<Rightarrow> 1 | Suc m \<Rightarrow> n * fac m)"
+ by (cases n) simp_all
+
+code_gen fac (SML "examples/fac_case.ML")
+
+text {*
+ \lstsml{Thy/examples/fac_case.ML}
+
+ \begin{warn}
+ Some statements in this section have to be treated with some
+ caution. First, since the HOL function package is still
+ under development, its setup with respect to code generation
+ may differ from what is presumed here.
+ Further, the attributes \emph{code} and \emph{code del}
+ associated with the existing code generator also apply to
+ the new one: \emph{code} implies \emph{code func},
+ and \emph{code del} implies \emph{code nofunc}.
+ \end{warn}
+*}
subsection {* Type classes *}
+text {*
+ Type classes enter the game via the Isar class package.
+ For a short introduction how to use it, see \fixme[ref];
+ here we just illustrate its relation on code generation.
+
+ In a target language, type classes may be represented
+ nativly (as in the case of Haskell). For languages
+ like SML, they are implemented using \emph{dictionaries}.
+ Our following example specified a class \qt{null},
+ assigning to each of its inhabitants a \qt{null} value:
+*}
+
+class null =
+ fixes null :: 'a
+
+consts
+ head :: "'a\<Colon>null list \<Rightarrow> 'a"
+
+primrec
+ "head [] = null"
+ "head (x#xs) = x"
+
+text {*
+ We provide some instances for our @{text null}:
+*}
+
+instance option :: (type) null
+ "null \<equiv> None" ..
+
+instance list :: (type) null
+ "null \<equiv> []" ..
+
+text {*
+ Constructing a dummy example:
+*}
+
+definition
+ "dummy = head [Some (Suc 0), None]"
+
+text {*
+ Type classes offer a suitable occassion to introduce
+ the Haskell serializer. Its usage is almost the same
+ as SML, but, in accordance with conventions some
+ common Haskell compilers enforce, each module ends
+ up in a single file which the file given by the user
+ then imports. The module hierarchy is reflected in
+ the file system.
+*}
+
+code_gen dummy (Haskell "examples/codegen.hs")
+ (* NOTE: you may use Haskell only once in this document *)
+
+text {*
+ \lsthaskell{Thy/examples/Codegen.hs}
+
+ (we have left out all other modules).
+
+ The whole code in SML with explicit dictionary passing:
+*}
+
+code_gen dummy (SML "examples/class.ML")
+
+text {*
+ \lstsml{Thy/examples/class.ML}
+*}
+
+subsection {* Incremental code generation *}
+
+(* print_codethms (\<dots>) and code_gen, 2 *)
+
+
+section {* Recipes and advanced topics \label{sec:advanced} *}
+
+(* no reference, IsarRef, but see paper *)
+
+subsection {* Library theories *}
+
subsection {* Preprocessing *}
(* preprocessing, print_codethms () *)
-subsection {* Incremental code generation *}
+subsection {* Customizing serialization *}
-(* print_codethms (\<dots>) *)
-
+(* existing libraries, understanding the type game, reflexive equations, code inline code_constsubst, code_abstype*)
-section {* Customizing serialization *}
-
-section {* Recipes and advanced topics *}
+section {* ML interfaces \label{sec:ml} *}
-(* understanding the type game, reflexive equations, code inline code_constsubst, code_abstype*)
-
-section {* ML interfaces *}
+(* under developement *)
subsection {* codegen\_data.ML *}
subsection {* Implementing code generator applications *}
+(* hooks *)
+
end
--- a/doc-src/IsarAdvanced/Codegen/Thy/ROOT.ML Fri Oct 20 10:44:34 2006 +0200
+++ b/doc-src/IsarAdvanced/Codegen/Thy/ROOT.ML Fri Oct 20 10:44:35 2006 +0200
@@ -1,4 +1,6 @@
(* $Id$ *)
+CodegenSerializer.sml_code_width := 74;
+
use_thy "Codegen";
--- a/doc-src/IsarAdvanced/Codegen/codegen.tex Fri Oct 20 10:44:34 2006 +0200
+++ b/doc-src/IsarAdvanced/Codegen/codegen.tex Fri Oct 20 10:44:35 2006 +0200
@@ -3,6 +3,7 @@
\documentclass[12pt,a4paper,fleqn]{report}
\usepackage{latexsym,graphicx}
+\usepackage{listings}
\usepackage[refpage]{nomencl}
\usepackage{../../iman,../../extra,../../isar,../../proof}
\usepackage{Thy/document/isabelle,Thy/document/isabellesym}
@@ -31,6 +32,12 @@
\newcommand{\isasymSHOW}{\cmd{show}}
\newcommand{\isasymNOTE}{\cmd{note}}
\newcommand{\isasymIN}{\cmd{in}}
+\newcommand{\isasymCODEGEN}{\cmd{code\_gen}}
+\newcommand{\isasymPRINTCODETHMS}{\cmd{print\_codethms}}
+\newcommand{\isasymFUN}{\cmd{fun}}
+\newcommand{\isasymFUNCTION}{\cmd{function}}
+\newcommand{\isasymPRIMREC}{\cmd{primrec}}
+\newcommand{\isasymRECDEF}{\cmd{recdef}}
\newcommand{\qt}[1]{``#1''}
\newcommand{\qtt}[1]{"{}{#1}"{}}
@@ -38,6 +45,10 @@
\newcommand{\strong}[1]{{\bfseries #1}}
\newcommand{\fixme}[1][!]{\strong{FIXME: #1}}
+\lstset{basicstyle=\scriptsize\ttfamily, keywordstyle=\itshape, commentstyle=\itshape\sffamily, frame=shadowbox}
+\newcommand{\lstsml}[1]{\lstset{language=ML}\lstinputlisting{#1}}
+\newcommand{\lsthaskell}[1]{\lstset{language=Haskell}\lstinputlisting{#1}}
+
\hyphenation{Isabelle}
\hyphenation{Isar}
@@ -52,7 +63,10 @@
\maketitle
\begin{abstract}
- \fixme
+ This tutorial gives a motivation-driven introduction
+ to a generic code generator framework in Isabelle
+ for generating executable code in functional
+ programming languages from logical specifications.
\end{abstract}
\thispagestyle{empty}\clearpage