src/Doc/Codegen/Foundations.thy
changeset 59377 056945909f60
parent 54890 cb892d835803
child 61076 bdc1e2f0a86a
equal deleted inserted replaced
59376:ead400fd6484 59377:056945909f60
     1 theory Foundations
     1 theory Foundations
     2 imports Introduction
     2 imports Introduction
     3 begin
     3 begin
     4 
     4 
     5 section {* Code generation foundations \label{sec:foundations} *}
     5 section \<open>Code generation foundations \label{sec:foundations}\<close>
     6 
     6 
     7 subsection {* Code generator architecture \label{sec:architecture} *}
     7 subsection \<open>Code generator architecture \label{sec:architecture}\<close>
     8 
     8 
     9 text {*
     9 text \<open>
    10   The code generator is actually a framework consisting of different
    10   The code generator is actually a framework consisting of different
    11   components which can be customised individually.
    11   components which can be customised individually.
    12 
    12 
    13   Conceptually all components operate on Isabelle's logic framework
    13   Conceptually all components operate on Isabelle's logic framework
    14   @{theory Pure}.  Practically, the object logic @{theory HOL}
    14   @{theory Pure}.  Practically, the object logic @{theory HOL}
    88   \end{itemize}
    88   \end{itemize}
    89 
    89 
    90   \noindent From these steps, only the last two are carried out
    90   \noindent From these steps, only the last two are carried out
    91   outside the logic; by keeping this layer as thin as possible, the
    91   outside the logic; by keeping this layer as thin as possible, the
    92   amount of code to trust is kept to a minimum.
    92   amount of code to trust is kept to a minimum.
    93 *}
    93 \<close>
    94 
    94 
    95 
    95 
    96 subsection {* The pre- and postprocessor \label{sec:preproc} *}
    96 subsection \<open>The pre- and postprocessor \label{sec:preproc}\<close>
    97 
    97 
    98 text {*
    98 text \<open>
    99   Before selected function theorems are turned into abstract code, a
    99   Before selected function theorems are turned into abstract code, a
   100   chain of definitional transformation steps is carried out:
   100   chain of definitional transformation steps is carried out:
   101   \emph{preprocessing}.  The preprocessor consists of two
   101   \emph{preprocessing}.  The preprocessor consists of two
   102   components: a \emph{simpset} and \emph{function transformers}.
   102   components: a \emph{simpset} and \emph{function transformers}.
   103 
   103 
   120   expressions suitable for logical reasoning and expressions 
   120   expressions suitable for logical reasoning and expressions 
   121   suitable for execution.  As example, take list membership; logically
   121   suitable for execution.  As example, take list membership; logically
   122   is is just expressed as @{term "x \<in> set xs"}.  But for execution
   122   is is just expressed as @{term "x \<in> set xs"}.  But for execution
   123   the intermediate set is not desirable.  Hence the following
   123   the intermediate set is not desirable.  Hence the following
   124   specification:
   124   specification:
   125 *}
   125 \<close>
   126 
   126 
   127 definition %quote member :: "'a list \<Rightarrow> 'a \<Rightarrow> bool"
   127 definition %quote member :: "'a list \<Rightarrow> 'a \<Rightarrow> bool"
   128 where
   128 where
   129   [code_abbrev]: "member xs x \<longleftrightarrow> x \<in> set xs"
   129   [code_abbrev]: "member xs x \<longleftrightarrow> x \<in> set xs"
   130 
   130 
   131 text {*
   131 text \<open>
   132   \noindent The \emph{@{attribute code_abbrev} attribute} declares
   132   \noindent The \emph{@{attribute code_abbrev} attribute} declares
   133   its theorem a rewrite rule for the postprocessor and the symmetric
   133   its theorem a rewrite rule for the postprocessor and the symmetric
   134   of its theorem as rewrite rule for the preprocessor.  Together,
   134   of its theorem as rewrite rule for the preprocessor.  Together,
   135   this has the effect that expressions @{thm (rhs) member_def [no_vars]}
   135   this has the effect that expressions @{thm (rhs) member_def [no_vars]}
   136   are replaced by @{thm (lhs) member_def [no_vars]} in generated code, but
   136   are replaced by @{thm (lhs) member_def [no_vars]} in generated code, but
   151   \noindent The current setup of the pre- and postprocessor may be inspected
   151   \noindent The current setup of the pre- and postprocessor may be inspected
   152   using the @{command_def print_codeproc} command.  @{command_def
   152   using the @{command_def print_codeproc} command.  @{command_def
   153   code_thms} (see \secref{sec:equations}) provides a convenient
   153   code_thms} (see \secref{sec:equations}) provides a convenient
   154   mechanism to inspect the impact of a preprocessor setup on code
   154   mechanism to inspect the impact of a preprocessor setup on code
   155   equations.
   155   equations.
   156 *}
   156 \<close>
   157 
   157 
   158 
   158 
   159 subsection {* Understanding code equations \label{sec:equations} *}
   159 subsection \<open>Understanding code equations \label{sec:equations}\<close>
   160 
   160 
   161 text {*
   161 text \<open>
   162   As told in \secref{sec:principle}, the notion of code equations is
   162   As told in \secref{sec:principle}, the notion of code equations is
   163   vital to code generation.  Indeed most problems which occur in
   163   vital to code generation.  Indeed most problems which occur in
   164   practice can be resolved by an inspection of the underlying code
   164   practice can be resolved by an inspection of the underlying code
   165   equations.
   165   equations.
   166 
   166 
   167   It is possible to exchange the default code equations for constants
   167   It is possible to exchange the default code equations for constants
   168   by explicitly proving alternative ones:
   168   by explicitly proving alternative ones:
   169 *}
   169 \<close>
   170 
   170 
   171 lemma %quote [code]:
   171 lemma %quote [code]:
   172   "dequeue (AQueue xs []) =
   172   "dequeue (AQueue xs []) =
   173      (if xs = [] then (None, AQueue [] [])
   173      (if xs = [] then (None, AQueue [] [])
   174        else dequeue (AQueue [] (rev xs)))"
   174        else dequeue (AQueue [] (rev xs)))"
   175   "dequeue (AQueue xs (y # ys)) =
   175   "dequeue (AQueue xs (y # ys)) =
   176      (Some y, AQueue xs ys)"
   176      (Some y, AQueue xs ys)"
   177   by (cases xs, simp_all) (cases "rev xs", simp_all)
   177   by (cases xs, simp_all) (cases "rev xs", simp_all)
   178 
   178 
   179 text {*
   179 text \<open>
   180   \noindent The annotation @{text "[code]"} is an @{text attribute}
   180   \noindent The annotation @{text "[code]"} is an @{text attribute}
   181   which states that the given theorems should be considered as code
   181   which states that the given theorems should be considered as code
   182   equations for a @{text fun} statement -- the corresponding constant
   182   equations for a @{text fun} statement -- the corresponding constant
   183   is determined syntactically.  The resulting code:
   183   is determined syntactically.  The resulting code:
   184 *}
   184 \<close>
   185 
   185 
   186 text %quotetypewriter {*
   186 text %quotetypewriter \<open>
   187   @{code_stmts dequeue (consts) dequeue (Haskell)}
   187   @{code_stmts dequeue (consts) dequeue (Haskell)}
   188 *}
   188 \<close>
   189 
   189 
   190 text {*
   190 text \<open>
   191   \noindent You may note that the equality test @{term "xs = []"} has
   191   \noindent You may note that the equality test @{term "xs = []"} has
   192   been replaced by the predicate @{term "List.null xs"}.  This is due
   192   been replaced by the predicate @{term "List.null xs"}.  This is due
   193   to the default setup of the \qn{preprocessor}.
   193   to the default setup of the \qn{preprocessor}.
   194 
   194 
   195   This possibility to select arbitrary code equations is the key
   195   This possibility to select arbitrary code equations is the key
   204   print_codesetup} command.
   204   print_codesetup} command.
   205 
   205 
   206   The code equations after preprocessing are already are blueprint of
   206   The code equations after preprocessing are already are blueprint of
   207   the generated program and can be inspected using the @{command
   207   the generated program and can be inspected using the @{command
   208   code_thms} command:
   208   code_thms} command:
   209 *}
   209 \<close>
   210 
   210 
   211 code_thms %quote dequeue
   211 code_thms %quote dequeue
   212 
   212 
   213 text {*
   213 text \<open>
   214   \noindent This prints a table with the code equations for @{const
   214   \noindent This prints a table with the code equations for @{const
   215   dequeue}, including \emph{all} code equations those equations depend
   215   dequeue}, including \emph{all} code equations those equations depend
   216   on recursively.  These dependencies themselves can be visualized using
   216   on recursively.  These dependencies themselves can be visualized using
   217   the @{command_def code_deps} command.
   217   the @{command_def code_deps} command.
   218 *}
   218 \<close>
   219 
   219 
   220 
   220 
   221 subsection {* Equality *}
   221 subsection \<open>Equality\<close>
   222 
   222 
   223 text {*
   223 text \<open>
   224   Implementation of equality deserves some attention.  Here an example
   224   Implementation of equality deserves some attention.  Here an example
   225   function involving polymorphic equality:
   225   function involving polymorphic equality:
   226 *}
   226 \<close>
   227 
   227 
   228 primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
   228 primrec %quote collect_duplicates :: "'a list \<Rightarrow> 'a list \<Rightarrow> 'a list \<Rightarrow> 'a list" where
   229   "collect_duplicates xs ys [] = xs"
   229   "collect_duplicates xs ys [] = xs"
   230 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs
   230 | "collect_duplicates xs ys (z#zs) = (if z \<in> set xs
   231     then if z \<in> set ys
   231     then if z \<in> set ys
   232       then collect_duplicates xs ys zs
   232       then collect_duplicates xs ys zs
   233       else collect_duplicates xs (z#ys) zs
   233       else collect_duplicates xs (z#ys) zs
   234     else collect_duplicates (z#xs) (z#ys) zs)"
   234     else collect_duplicates (z#xs) (z#ys) zs)"
   235 
   235 
   236 text {*
   236 text \<open>
   237   \noindent During preprocessing, the membership test is rewritten,
   237   \noindent During preprocessing, the membership test is rewritten,
   238   resulting in @{const List.member}, which itself performs an explicit
   238   resulting in @{const List.member}, which itself performs an explicit
   239   equality check, as can be seen in the corresponding @{text SML} code:
   239   equality check, as can be seen in the corresponding @{text SML} code:
   240 *}
   240 \<close>
   241 
   241 
   242 text %quotetypewriter {*
   242 text %quotetypewriter \<open>
   243   @{code_stmts collect_duplicates (SML)}
   243   @{code_stmts collect_duplicates (SML)}
   244 *}
   244 \<close>
   245 
   245 
   246 text {*
   246 text \<open>
   247   \noindent Obviously, polymorphic equality is implemented the Haskell
   247   \noindent Obviously, polymorphic equality is implemented the Haskell
   248   way using a type class.  How is this achieved?  HOL introduces an
   248   way using a type class.  How is this achieved?  HOL introduces an
   249   explicit class @{class equal} with a corresponding operation @{const
   249   explicit class @{class equal} with a corresponding operation @{const
   250   HOL.equal} such that @{thm equal [no_vars]}.  The preprocessing
   250   HOL.equal} such that @{thm equal [no_vars]}.  The preprocessing
   251   framework does the rest by propagating the @{class equal} constraints
   251   framework does the rest by propagating the @{class equal} constraints
   252   through all dependent code equations.  For datatypes, instances of
   252   through all dependent code equations.  For datatypes, instances of
   253   @{class equal} are implicitly derived when possible.  For other types,
   253   @{class equal} are implicitly derived when possible.  For other types,
   254   you may instantiate @{text equal} manually like any other type class.
   254   you may instantiate @{text equal} manually like any other type class.
   255 *}
   255 \<close>
   256 
   256 
   257 
   257 
   258 subsection {* Explicit partiality \label{sec:partiality} *}
   258 subsection \<open>Explicit partiality \label{sec:partiality}\<close>
   259 
   259 
   260 text {*
   260 text \<open>
   261   Partiality usually enters the game by partial patterns, as
   261   Partiality usually enters the game by partial patterns, as
   262   in the following example, again for amortised queues:
   262   in the following example, again for amortised queues:
   263 *}
   263 \<close>
   264 
   264 
   265 definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
   265 definition %quote strict_dequeue :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
   266   "strict_dequeue q = (case dequeue q
   266   "strict_dequeue q = (case dequeue q
   267     of (Some x, q') \<Rightarrow> (x, q'))"
   267     of (Some x, q') \<Rightarrow> (x, q'))"
   268 
   268 
   270   "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)"
   270   "strict_dequeue (AQueue xs (y # ys)) = (y, AQueue xs ys)"
   271   "strict_dequeue (AQueue xs []) =
   271   "strict_dequeue (AQueue xs []) =
   272     (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))"
   272     (case rev xs of y # ys \<Rightarrow> (y, AQueue [] ys))"
   273   by (simp_all add: strict_dequeue_def) (cases xs, simp_all split: list.split)
   273   by (simp_all add: strict_dequeue_def) (cases xs, simp_all split: list.split)
   274 
   274 
   275 text {*
   275 text \<open>
   276   \noindent In the corresponding code, there is no equation
   276   \noindent In the corresponding code, there is no equation
   277   for the pattern @{term "AQueue [] []"}:
   277   for the pattern @{term "AQueue [] []"}:
   278 *}
   278 \<close>
   279 
   279 
   280 text %quotetypewriter {*
   280 text %quotetypewriter \<open>
   281   @{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}
   281   @{code_stmts strict_dequeue (consts) strict_dequeue (Haskell)}
   282 *}
   282 \<close>
   283 
   283 
   284 text {*
   284 text \<open>
   285   \noindent In some cases it is desirable to have this
   285   \noindent In some cases it is desirable to have this
   286   pseudo-\qt{partiality} more explicitly, e.g.~as follows:
   286   pseudo-\qt{partiality} more explicitly, e.g.~as follows:
   287 *}
   287 \<close>
   288 
   288 
   289 axiomatization %quote empty_queue :: 'a
   289 axiomatization %quote empty_queue :: 'a
   290 
   290 
   291 definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
   291 definition %quote strict_dequeue' :: "'a queue \<Rightarrow> 'a \<times> 'a queue" where
   292   "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)"
   292   "strict_dequeue' q = (case dequeue q of (Some x, q') \<Rightarrow> (x, q') | _ \<Rightarrow> empty_queue)"
   296      else strict_dequeue' (AQueue [] (rev xs)))"
   296      else strict_dequeue' (AQueue [] (rev xs)))"
   297   "strict_dequeue' (AQueue xs (y # ys)) =
   297   "strict_dequeue' (AQueue xs (y # ys)) =
   298      (y, AQueue xs ys)"
   298      (y, AQueue xs ys)"
   299   by (simp_all add: strict_dequeue'_def split: list.splits)
   299   by (simp_all add: strict_dequeue'_def split: list.splits)
   300 
   300 
   301 text {*
   301 text \<open>
   302   Observe that on the right hand side of the definition of @{const
   302   Observe that on the right hand side of the definition of @{const
   303   "strict_dequeue'"}, the unspecified constant @{const empty_queue} occurs.
   303   "strict_dequeue'"}, the unspecified constant @{const empty_queue} occurs.
   304 
   304 
   305   Normally, if constants without any code equations occur in a
   305   Normally, if constants without any code equations occur in a
   306   program, the code generator complains (since in most cases this is
   306   program, the code generator complains (since in most cases this is
   308   of as function definitions which always fail,
   308   of as function definitions which always fail,
   309   since there is never a successful pattern match on the left hand
   309   since there is never a successful pattern match on the left hand
   310   side.  In order to categorise a constant into that category
   310   side.  In order to categorise a constant into that category
   311   explicitly, use the @{attribute code} attribute with
   311   explicitly, use the @{attribute code} attribute with
   312   @{text abort}:
   312   @{text abort}:
   313 *}
   313 \<close>
   314 
   314 
   315 declare %quote [[code abort: empty_queue]]
   315 declare %quote [[code abort: empty_queue]]
   316 
   316 
   317 text {*
   317 text \<open>
   318   \noindent Then the code generator will just insert an error or
   318   \noindent Then the code generator will just insert an error or
   319   exception at the appropriate position:
   319   exception at the appropriate position:
   320 *}
   320 \<close>
   321 
   321 
   322 text %quotetypewriter {*
   322 text %quotetypewriter \<open>
   323   @{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}
   323   @{code_stmts strict_dequeue' (consts) empty_queue strict_dequeue' (Haskell)}
   324 *}
   324 \<close>
   325 
   325 
   326 text {*
   326 text \<open>
   327   \noindent This feature however is rarely needed in practice.  Note
   327   \noindent This feature however is rarely needed in practice.  Note
   328   also that the HOL default setup already declares @{const undefined},
   328   also that the HOL default setup already declares @{const undefined},
   329   which is most likely to be used in such situations, as
   329   which is most likely to be used in such situations, as
   330   @{text "code abort"}.
   330   @{text "code abort"}.
   331 *}
   331 \<close>
   332 
   332 
   333 
   333 
   334 subsection {* If something goes utterly wrong \label{sec:utterly_wrong} *}
   334 subsection \<open>If something goes utterly wrong \label{sec:utterly_wrong}\<close>
   335 
   335 
   336 text {*
   336 text \<open>
   337   Under certain circumstances, the code generator fails to produce
   337   Under certain circumstances, the code generator fails to produce
   338   code entirely.  To debug these, the following hints may prove
   338   code entirely.  To debug these, the following hints may prove
   339   helpful:
   339   helpful:
   340 
   340 
   341   \begin{description}
   341   \begin{description}
   369       generation, whose result in turn can be used to trace the
   369       generation, whose result in turn can be used to trace the
   370       problem.  The most prominent case here are mismatches in type
   370       problem.  The most prominent case here are mismatches in type
   371       class signatures (\qt{wellsortedness error}).
   371       class signatures (\qt{wellsortedness error}).
   372 
   372 
   373   \end{description}
   373   \end{description}
   374 *}
   374 \<close>
   375 
   375 
   376 end
   376 end