# HG changeset patch # User haftmann # Date 1161333875 -7200 # Node ID a32d357dfd702e19f42942536f49ac6977afc01b # Parent c45591716692d761d75dc6535318b4b5a13967e0 started tutorial diff -r c45591716692 -r a32d357dfd70 doc-src/IsarAdvanced/Codegen/Thy/Codegen.thy --- a/doc-src/IsarAdvanced/Codegen/Thy/Codegen.thy Fri Oct 20 10:44:34 2006 +0200 +++ b/doc-src/IsarAdvanced/Codegen/Thy/Codegen.thy Fri Oct 20 10:44:35 2006 +0200 @@ -9,39 +9,394 @@ section {* Introduction *} +subsection {* Motivation *} + text {* - \cite{isa-tutorial} -*} (* add graphics here *) + Executing formal specifications as programs is a well-established + topic in the theorem proving community. With increasing + application of theorem proving systems in the area of + software development and verification, its relevance manifests + for running test cases and rapid prototyping. In logical + calculi like constructive type theory, + a notion of executability is implicit due to the nature + of the calculus. In contrast, specifications in Isabelle/HOL + can be highly non-executable. In order to bridge + the gap between logic and executable specifications, + an explicit non-trivial transformation has to be applied: + code generation. + + This tutorial introduces a generic code generator for the + Isabelle system \cite{isa-tutorial}. + Generic in the sense that the + \qn{target language} for which code shall ultimately be + generated is not fixed but may be an arbitrary state-of-the-art + functional programming language (currently, the implementation + supports SML \cite{web:sml} and Haskell \cite{web:haskell}). + We aim to provide a + versatile environment + suitable for software development and verification, + structuring the process + of code generation into a small set of orthogonal principles + while achieving a big coverage of application areas + with maximum flexibility. +*} + + +subsection {* Overview *} + +text {* + The code generator aims to be usable with no further ado + in most cases while allowing for detailed customization. + This manifests in the structure of this tutorial: this introduction + continous with a short introduction of concepts. Section + \secref{sec:basics} explains how to use the framework naivly, + presuming a reasonable default setup. Then, section + \secref{sec:advanced} deals with advanced topics, + introducing further aspects of the code generator framework + in a motivation-driven manner. Last, section \secref{sec:ml} + introduces the framework's internal programming interfaces. -section {* Basics *} + \begin{warn} + Ultimately, the code generator which this tutorial deals with + is supposed to replace the already established code generator + by Stefan Berghofer \cite{Berghofer-Nipkow:2002}. + So, for the momennt, there are two distinct code generators + in Isabelle. + Also note that while the framework itself is largely + object-logic independent, only HOL provides a reasonable + framework setup. + \end{warn} +*} + + +subsection {* Code generation process *} + +text {* + The code generator employs a notion of executability + for three foundational executable ingredients known + from functional programming: + \emph{function equations}, \emph{datatypes}, and + \emph{type classes}. A function equation as a first approximation + is a theorem of the form @{text "f t\<^isub>1 t\<^isub>2 \ t\<^isub>n \ t"} + (an equation headed by a constant @{text f} with arguments + @{text "t\<^isub>1 t\<^isub>2 \ t\<^isub>n"} and right hand side @{text t}. + Code generation aims to turn function equations + into a functional program by running through + a process (see figure \ref{fig:process}): + + \begin{itemize} + + \item Out of the vast collection of theorems proven in a + \qn{theory}, a reasonable subset modeling + function equations is \qn{selected}. + + \item On those selected theorems, certain + transformations are carried out + (\qn{preprocessing}). Their purpose is to turn theorems + representing non- or badly executable + specifications into equivalent but executable counterparts. + The result is a structured collection of \qn{code theorems}. + + \item These \qn{code theorems} then are extracted + into an Haskell-like intermediate + language. + + \item Finally, out of the intermediate language the final + code in the desired \qn{target language} is \qn{serialized}. + + \end{itemize} + + \begin{figure}[h] + \centering + \includegraphics[width=0.3\textwidth]{codegen_process} + \caption{code generator -- processing overview} + \label{fig:process} + \end{figure} + + From these steps, only the two last are carried out + outside the logic; by keeping this layer as + thin as possible, the amount of code to trust is + kept to a minimum. +*} + + + +section {* Basics \label{sec:basics} *} subsection {* Invoking the code generator *} +text {* + Thanks to a reasonable setup of the HOL theories, in + most cases code generation proceeds without further ado: +*} + +consts + fac :: "nat \ nat" + +primrec + "fac 0 = 1" + "fac (Suc n) = Suc n * fac n" + +text {* + This executable specification is now turned to SML code: +*} + +code_gen fac (SML "examples/fac.ML") + +text {* + The \isasymCODEGEN command takes a space-seperated list of + constants together with \qn{serialization directives} + in parentheses. These start with a \qn{target language} + identifer, followed by arguments, their semantics + depending on the target. In the SML case, a filename + is given where to write the generated code to. + + Internally, the function equations for all selected + constants are taken, including any tranitivly required + constants, datatypes and classes, resulting in the following + code: + + \lstsml{Thy/examples/fac.ML} + + The code generator will complain when a required + ingredient does not provide a executable counterpart. + This is the case if an involved type is not a datatype: +*} + +(*<*) +setup {* Sign.add_path "foo" *} +(*>*) + +typedecl 'a foo + +definition + bar :: "'a foo \ 'a \ 'a" + "bar x y = y" + +(*<*) +hide type foo +hide const bar + +setup {* Sign.parent_path *} + +datatype 'a foo = Foo + +definition + bar :: "'a foo \ 'a \ 'a" + "bar x y = y" +(*>*) + +code_gen bar (SML "examples/fail_type.ML") + +text {* + \noindent will result in an error. Likewise, generating code + for constants not yielding + a function equation will fail, e.g.~the Hilbert choice + operation @{text "SOME"}: +*} + +(*<*) +setup {* Sign.add_path "foo" *} +(*>*) + +definition + pick_some :: "'a list \ 'a" + "pick_some xs = (SOME x. x \ set xs)" + +(*<*) +hide const pick_some + +setup {* Sign.parent_path *} + +definition + pick_some :: "'a list \ 'a" + "pick_some = hd" +(*>*) + +code_gen pick_some (SML "examples/fail_const.ML") + subsection {* Theorem selection *} -(* print_codethms, code func, default setup code nofunc *) +text {* + The list of all function equations in a theory may be inspected + using the \isasymPRINTCODETHMS command: +*} + +print_codethms + +text {* + \noindent which displays a table of constant with corresponding + function equations (the additional stuff displayed + shall not bother us for the moment). If this table does + not provide at least one function + equation, the table of primititve definitions is searched + whether it provides one. + + The typical HOL tools are already set up in a way that + function definitions introduced by \isasymFUN, \isasymFUNCTION, + \isasymPRIMREC, \isasymRECDEF are implicitly propagated + to this function equation table. Specific theorems may be + selected using an attribute: \emph{code func}. As example, + a weight selector function: +*} + +consts + pick :: "(nat \ 'a) list \ nat \ 'a" + +primrec + "pick (x#xs) n = (let (k, v) = x in + if n < k then v else pick xs (n - k))" + +text {* + We want to eliminate the explicit destruction + of @{term x} to @{term "(k, v)"}: +*} + +lemma [code func]: + "pick ((k, v)#xs) n = (if n < k then v else pick xs (n - k))" + by simp + +code_gen pick (SML "examples/pick1.ML") + +text {* + This theorem is now added to the function equation table: + + \lstsml{Thy/examples/pick1.ML} + + It might be convenient to remove the pointless original + equation, using the \emph{nofunc} attribute: +*} + +lemmas [code nofunc] = pick.simps + +code_gen pick (SML "examples/pick2.ML") + +text {* + \lstsml{Thy/examples/pick2.ML} + + Syntactic redundancies are implicitly dropped. For example, + using a modified version of the @{const fac} function + as function equation, the then redundant (since + syntactically more subsumed) original function equations + are dropped, resulting in a warning: +*} + +lemma [code func]: + "fac n = (case n of 0 \ 1 | Suc m \ n * fac m)" + by (cases n) simp_all + +code_gen fac (SML "examples/fac_case.ML") + +text {* + \lstsml{Thy/examples/fac_case.ML} + + \begin{warn} + Some statements in this section have to be treated with some + caution. First, since the HOL function package is still + under development, its setup with respect to code generation + may differ from what is presumed here. + Further, the attributes \emph{code} and \emph{code del} + associated with the existing code generator also apply to + the new one: \emph{code} implies \emph{code func}, + and \emph{code del} implies \emph{code nofunc}. + \end{warn} +*} subsection {* Type classes *} +text {* + Type classes enter the game via the Isar class package. + For a short introduction how to use it, see \fixme[ref]; + here we just illustrate its relation on code generation. + + In a target language, type classes may be represented + nativly (as in the case of Haskell). For languages + like SML, they are implemented using \emph{dictionaries}. + Our following example specified a class \qt{null}, + assigning to each of its inhabitants a \qt{null} value: +*} + +class null = + fixes null :: 'a + +consts + head :: "'a\null list \ 'a" + +primrec + "head [] = null" + "head (x#xs) = x" + +text {* + We provide some instances for our @{text null}: +*} + +instance option :: (type) null + "null \ None" .. + +instance list :: (type) null + "null \ []" .. + +text {* + Constructing a dummy example: +*} + +definition + "dummy = head [Some (Suc 0), None]" + +text {* + Type classes offer a suitable occassion to introduce + the Haskell serializer. Its usage is almost the same + as SML, but, in accordance with conventions some + common Haskell compilers enforce, each module ends + up in a single file which the file given by the user + then imports. The module hierarchy is reflected in + the file system. +*} + +code_gen dummy (Haskell "examples/codegen.hs") + (* NOTE: you may use Haskell only once in this document *) + +text {* + \lsthaskell{Thy/examples/Codegen.hs} + + (we have left out all other modules). + + The whole code in SML with explicit dictionary passing: +*} + +code_gen dummy (SML "examples/class.ML") + +text {* + \lstsml{Thy/examples/class.ML} +*} + +subsection {* Incremental code generation *} + +(* print_codethms (\) and code_gen, 2 *) + + +section {* Recipes and advanced topics \label{sec:advanced} *} + +(* no reference, IsarRef, but see paper *) + +subsection {* Library theories *} + subsection {* Preprocessing *} (* preprocessing, print_codethms () *) -subsection {* Incremental code generation *} +subsection {* Customizing serialization *} -(* print_codethms (\) *) - +(* existing libraries, understanding the type game, reflexive equations, code inline code_constsubst, code_abstype*) -section {* Customizing serialization *} - -section {* Recipes and advanced topics *} +section {* ML interfaces \label{sec:ml} *} -(* understanding the type game, reflexive equations, code inline code_constsubst, code_abstype*) - -section {* ML interfaces *} +(* under developement *) subsection {* codegen\_data.ML *} subsection {* Implementing code generator applications *} +(* hooks *) + end diff -r c45591716692 -r a32d357dfd70 doc-src/IsarAdvanced/Codegen/Thy/ROOT.ML --- a/doc-src/IsarAdvanced/Codegen/Thy/ROOT.ML Fri Oct 20 10:44:34 2006 +0200 +++ b/doc-src/IsarAdvanced/Codegen/Thy/ROOT.ML Fri Oct 20 10:44:35 2006 +0200 @@ -1,4 +1,6 @@ (* $Id$ *) +CodegenSerializer.sml_code_width := 74; + use_thy "Codegen"; diff -r c45591716692 -r a32d357dfd70 doc-src/IsarAdvanced/Codegen/codegen.tex --- a/doc-src/IsarAdvanced/Codegen/codegen.tex Fri Oct 20 10:44:34 2006 +0200 +++ b/doc-src/IsarAdvanced/Codegen/codegen.tex Fri Oct 20 10:44:35 2006 +0200 @@ -3,6 +3,7 @@ \documentclass[12pt,a4paper,fleqn]{report} \usepackage{latexsym,graphicx} +\usepackage{listings} \usepackage[refpage]{nomencl} \usepackage{../../iman,../../extra,../../isar,../../proof} \usepackage{Thy/document/isabelle,Thy/document/isabellesym} @@ -31,6 +32,12 @@ \newcommand{\isasymSHOW}{\cmd{show}} \newcommand{\isasymNOTE}{\cmd{note}} \newcommand{\isasymIN}{\cmd{in}} +\newcommand{\isasymCODEGEN}{\cmd{code\_gen}} +\newcommand{\isasymPRINTCODETHMS}{\cmd{print\_codethms}} +\newcommand{\isasymFUN}{\cmd{fun}} +\newcommand{\isasymFUNCTION}{\cmd{function}} +\newcommand{\isasymPRIMREC}{\cmd{primrec}} +\newcommand{\isasymRECDEF}{\cmd{recdef}} \newcommand{\qt}[1]{``#1''} \newcommand{\qtt}[1]{"{}{#1}"{}} @@ -38,6 +45,10 @@ \newcommand{\strong}[1]{{\bfseries #1}} \newcommand{\fixme}[1][!]{\strong{FIXME: #1}} +\lstset{basicstyle=\scriptsize\ttfamily, keywordstyle=\itshape, commentstyle=\itshape\sffamily, frame=shadowbox} +\newcommand{\lstsml}[1]{\lstset{language=ML}\lstinputlisting{#1}} +\newcommand{\lsthaskell}[1]{\lstset{language=Haskell}\lstinputlisting{#1}} + \hyphenation{Isabelle} \hyphenation{Isar} @@ -52,7 +63,10 @@ \maketitle \begin{abstract} - \fixme + This tutorial gives a motivation-driven introduction + to a generic code generator framework in Isabelle + for generating executable code in functional + programming languages from logical specifications. \end{abstract} \thispagestyle{empty}\clearpage