doc-src/Codegen/Thy/Foundations.thy
changeset 48951 b9238cbcdd41
parent 48950 9965099f51ad
child 48952 29562708e05c
--- a/doc-src/Codegen/Thy/Foundations.thy	Mon Aug 27 22:31:16 2012 +0200
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,347 +0,0 @@
-theory Foundations
-imports Introduction
-begin
-
-section {* Code generation foundations \label{sec:foundations} *}
-
-subsection {* Code generator architecture \label{sec:architecture} *}
-
-text {*
-  The code generator is actually a framework consisting of different
-  components which can be customised individually.
-
-  Conceptually all components operate on Isabelle's logic framework
-  @{theory Pure}.  Practically, the object logic @{theory HOL}
-  provides the necessary facilities to make use of the code generator,
-  mainly since it is an extension of @{theory Pure}.
-
-  The constellation of the different components is visualized in the
-  following picture.
-
-  \begin{figure}[h]
-    \includegraphics{architecture}
-    \caption{Code generator architecture}
-    \label{fig:arch}
-  \end{figure}
-
-  \noindent Central to code generation is the notion of \emph{code
-  equations}.  A code equation as a first approximation is a theorem
-  of the form @{text "f t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n \<equiv> t"} (an equation headed by a
-  constant @{text f} with arguments @{text "t\<^isub>1 t\<^isub>2 \<dots> t\<^isub>n"} and right
-  hand side @{text t}).
-
-  \begin{itemize}
-
-    \item Starting point of code generation is a collection of (raw)
-      code equations in a theory. It is not relevant where they stem
-      from, but typically they were either produced by specification
-      tools or proved explicitly by the user.
-      
-    \item These raw code equations can be subjected to theorem
-      transformations.  This \qn{preprocessor} (see
-      \secref{sec:preproc}) can apply the full expressiveness of
-      ML-based theorem transformations to code generation.  The result
-      of preprocessing is a structured collection of code equations.
-
-    \item These code equations are \qn{translated} to a program in an
-      abstract intermediate language.  Think of it as a kind of
-      \qt{Mini-Haskell} with four \qn{statements}: @{text data} (for
-      datatypes), @{text fun} (stemming from code equations), also
-      @{text class} and @{text inst} (for type classes).
-
-    \item Finally, the abstract program is \qn{serialised} into
-      concrete source code of a target language.  This step only
-      produces concrete syntax but does not change the program in
-      essence; all conceptual transformations occur in the translation
-      step.
-
-  \end{itemize}
-
-  \noindent From these steps, only the last two are carried out
-  outside the logic; by keeping this layer as thin as possible, the
-  amount of code to trust is kept to a minimum.
-*}
-
-
-subsection {* The preprocessor \label{sec:preproc} *}
-
-text {*
-  Before selected function theorems are turned into abstract code, a
-  chain of definitional transformation steps is carried out:
-  \emph{preprocessing}.  The preprocessor consists of two
-  components: a \emph{simpset} and \emph{function transformers}.
-
-  The \emph{simpset} can apply the full generality of the Isabelle
-  simplifier.  Due to the interpretation of theorems as code
-  equations, rewrites are applied to the right hand side and the
-  arguments of the left hand side of an equation, but never to the
-  constant heading the left hand side.  An important special case are
-  \emph{unfold theorems}, which may be declared and removed using the
-  @{attribute code_unfold} or \emph{@{attribute code_unfold} del}
-  attribute, respectively.
-
-  Some common applications:
-*}
-
-text_raw {*
-  \begin{itemize}
-*}
-
-text {*
-     \item replacing non-executable constructs by executable ones:
-*}     
-
-lemma %quote [code_unfold]:
-  "x \<in> set xs \<longleftrightarrow> List.member xs x" by (fact in_set_member)
-
-text {*
-     \item replacing executable but inconvenient constructs:
-*}
-
-lemma %quote [code_unfold]:
-  "xs = [] \<longleftrightarrow> List.null xs" by (fact eq_Nil_null)
-
-text {*
-     \item eliminating disturbing expressions:
-*}
-
-lemma %quote [code_unfold]:
-  "1 = Suc 0" by (fact One_nat_def)
-
-text_raw {*
-  \end{itemize}
-*}
-
-text {*
-  \noindent \emph{Function transformers} provide a very general
-  interface, transforming a list of function theorems to another list
-  of function theorems, provided that neither the heading constant nor
-  its type change.  The @{term "0\<Colon>nat"} / @{const Suc} pattern
-  elimination implemented in theory @{text Efficient_Nat} (see
-  \secref{eff_nat}) uses this interface.
-
-  \noindent The current setup of the preprocessor may be inspected
-  using the @{command_def print_codeproc} command.  @{command_def
-  code_thms} (see \secref{sec:equations}) provides a convenient
-  mechanism to inspect the impact of a preprocessor setup on code
-  equations.
-*}
-
-
-subsection {* Understanding code equations \label{sec:equations} *}
-
-text {*
-  As told in \secref{sec:principle}, the notion of code equations is
-  vital to code generation.  Indeed most problems which occur in
-  practice can be resolved by an inspection of the underlying code
-  equations.
-
-  It is possible to exchange the default code equations for constants
-  by explicitly proving alternative ones:
-*}
-
-lemma %quote [code]:
-  "dequeue (AQueue xs []) =
-     (if xs = [] then (None, AQueue [] [])
-       else dequeue (AQueue [] (rev xs)))"
-  "dequeue (AQueue xs (y # ys)) =
-     (Some y, AQueue xs ys)"
-  by (cases xs, simp_all) (cases "rev xs", simp_all)
-
-text {*
-  \noindent The annotation @{text "[code]"} is an @{text attribute}
-  which states that the given theorems should be considered as code
-  equations for a @{text fun} statement -- the corresponding constant
-  is determined syntactically.  The resulting code:
-*}
-
-text %quotetypewriter {*
-  @{code_stmts dequeue (consts) dequeue (Haskell)}
-*}
-
-text {*
-  \noindent You may note that the equality test @{term "xs = []"} has
-  been replaced by the predicate @{term "List.null xs"}.  This is due
-  to the default setup of the \qn{preprocessor}.
-
-  This possibility to select arbitrary code equations is the key
-  technique for program and datatype refinement (see
-  \secref{sec:refinement}).
-
-  Due to the preprocessor, there is the distinction of raw code
-  equations (before preprocessing) and code equations (after
-  preprocessing).
-
-  The first can be listed (among other data) using the @{command_def
-  print_codesetup} command.
-
-  The code equations after preprocessing are already are blueprint of
-  the generated program and can be inspected using the @{command
-  code_thms} command:
-*}
-
-code_thms %quote dequeue
-
-text {*
-  \noindent This prints a table with the code equations for @{const
-  dequeue}, including \emph{all} code equations those equations depend
-  on recursively.  These dependencies themselves can be visualized using
-  the @{command_def code_deps} command.
-*}
-
-
-subsection {* Equality *}
-
-text {*
-  Implementation of equality deserves some attention.  Here an example
-  function involving polymorphic equality:
-*}
-
-primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
-  "collect_duplicates xs ys [] = xs"
-| "collect_duplicates xs ys (z#zs) = (if z \<in> set xs
-    then if z \<in> set ys
-      then collect_duplicates xs ys zs
-      else collect_duplicates xs (z#ys) zs
-    else collect_duplicates (z#xs) (z#ys) zs)"
-
-text {*
-  \noindent During preprocessing, the membership test is rewritten,
-  resulting in @{const List.member}, which itself performs an explicit
-  equality check, as can be seen in the corresponding @{text SML} code:
-*}
-
-text %quotetypewriter {*
-  @{code_stmts collect_duplicates (SML)}
-*}
-
-text {*
-  \noindent Obviously, polymorphic equality is implemented the Haskell
-  way using a type class.  How is this achieved?  HOL introduces an
-  explicit class @{class equal} with a corresponding operation @{const
-  HOL.equal} such that @{thm equal [no_vars]}.  The preprocessing
-  framework does the rest by propagating the @{class equal} constraints
-  through all dependent code equations.  For datatypes, instances of
-  @{class equal} are implicitly derived when possible.  For other types,
-  you may instantiate @{text equal} manually like any other type class.
-*}
-
-
-subsection {* Explicit partiality \label{sec:partiality} *}
-
-text {*
-  Partiality usually enters the game by partial patterns, as
-  in the following example, again for amortised queues:
-*}
-
-definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
-  "strict_dequeue q = (case dequeue q
-    of (Some x, q') \<Rightarrow> (x, q'))"
-
-lemma %quote strict_dequeue_AQueue [code]:
-  "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)"
-  "strict_dequeue (AQueue xs []) =
-    (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))"
-  by (simp_all add: strict_dequeue_def) (cases xs, simp_all split: list.split)
-
-text {*
-  \noindent In the corresponding code, there is no equation
-  for the pattern @{term "AQueue [] []"}:
-*}
-
-text %quotetypewriter {*
-  @{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}
-*}
-
-text {*
-  \noindent In some cases it is desirable to have this
-  pseudo-\qt{partiality} more explicitly, e.g.~as follows:
-*}
-
-axiomatization %quote empty_queue :: 'a
-
-definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
-  "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)"
-
-lemma %quote strict_dequeue'_AQueue [code]:
-  "strict_dequeue' (AQueue xs []) = (if xs = [] then empty_queue
-     else strict_dequeue' (AQueue [] (rev xs)))"
-  "strict_dequeue' (AQueue xs (y # ys)) =
-     (y, AQueue xs ys)"
-  by (simp_all add: strict_dequeue'_def split: list.splits)
-
-text {*
-  Observe that on the right hand side of the definition of @{const
-  "strict_dequeue'"}, the unspecified constant @{const empty_queue} occurs.
-
-  Normally, if constants without any code equations occur in a
-  program, the code generator complains (since in most cases this is
-  indeed an error).  But such constants can also be thought
-  of as function definitions which always fail,
-  since there is never a successful pattern match on the left hand
-  side.  In order to categorise a constant into that category
-  explicitly, use @{command_def "code_abort"}:
-*}
-
-code_abort %quote empty_queue
-
-text {*
-  \noindent Then the code generator will just insert an error or
-  exception at the appropriate position:
-*}
-
-text %quotetypewriter {*
-  @{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}
-*}
-
-text {*
-  \noindent This feature however is rarely needed in practice.  Note
-  also that the HOL default setup already declares @{const undefined}
-  as @{command "code_abort"}, which is most likely to be used in such
-  situations.
-*}
-
-
-subsection {* If something goes utterly wrong \label{sec:utterly_wrong} *}
-
-text {*
-  Under certain circumstances, the code generator fails to produce
-  code entirely.  To debug these, the following hints may prove
-  helpful:
-
-  \begin{description}
-
-    \ditem{\emph{Check with a different target language}.}  Sometimes
-      the situation gets more clear if you switch to another target
-      language; the code generated there might give some hints what
-      prevents the code generator to produce code for the desired
-      language.
-
-    \ditem{\emph{Inspect code equations}.}  Code equations are the central
-      carrier of code generation.  Most problems occurring while generating
-      code can be traced to single equations which are printed as part of
-      the error message.  A closer inspection of those may offer the key
-      for solving issues (cf.~\secref{sec:equations}).
-
-    \ditem{\emph{Inspect preprocessor setup}.}  The preprocessor might
-      transform code equations unexpectedly; to understand an
-      inspection of its setup is necessary (cf.~\secref{sec:preproc}).
-
-    \ditem{\emph{Generate exceptions}.}  If the code generator
-      complains about missing code equations, in can be helpful to
-      implement the offending constants as exceptions
-      (cf.~\secref{sec:partiality}); this allows at least for a formal
-      generation of code, whose inspection may then give clues what is
-      wrong.
-
-    \ditem{\emph{Remove offending code equations}.}  If code
-      generation is prevented by just a single equation, this can be
-      removed (cf.~\secref{sec:equations}) to allow formal code
-      generation, whose result in turn can be used to trace the
-      problem.  The most prominent case here are mismatches in type
-      class signatures (\qt{wellsortedness error}).
-
-  \end{description}
-*}
-
-end