src/Doc/Codegen/Evaluation.thy
author haftmann
Mon Feb 06 20:56:34 2017 +0100 (2017-02-06)
changeset 64990 c6a7de505796
parent 63670 8e0148e1f5f4
child 65041 2525e680f94f
permissions -rw-r--r--
more explicit errors in pathological cases
     1 theory Evaluation
     2 imports Setup
     3 begin (*<*)
     4 
     5 ML \<open>
     6   Isabelle_System.mkdirs (File.tmp_path (Path.basic "examples"))
     7 \<close> (*>*)
     8 
     9 section \<open>Evaluation \label{sec:evaluation}\<close>
    10 
    11 text \<open>
    12   Recalling \secref{sec:principle}, code generation turns a system of
    13   equations into a program with the \emph{same} equational semantics.
    14   As a consequence, this program can be used as a \emph{rewrite
    15   engine} for terms: rewriting a term @{term "t"} using a program to a
    16   term @{term "t'"} yields the theorems @{prop "t \<equiv> t'"}.  This
    17   application of code generation in the following is referred to as
    18   \emph{evaluation}.
    19 \<close>
    20 
    21 
    22 subsection \<open>Evaluation techniques\<close>
    23 
    24 text \<open>
    25   The existing infrastructure provides a rich palette of evaluation
    26   techniques, each comprising different aspects:
    27 
    28   \begin{description}
    29 
    30     \item[Expressiveness.]  Depending on how good symbolic computation
    31       is supported, the class of terms which can be evaluated may be
    32       bigger or smaller.
    33 
    34     \item[Efficiency.]  The more machine-near the technique, the
    35       faster it is.
    36 
    37     \item[Trustability.]  Techniques which a huge (and also probably
    38       more configurable infrastructure) are more fragile and less
    39       trustable.
    40 
    41   \end{description}
    42 \<close>
    43 
    44 
    45 subsubsection \<open>The simplifier (@{text simp})\<close>
    46 
    47 text \<open>
    48   The simplest way for evaluation is just using the simplifier with
    49   the original code equations of the underlying program.  This gives
    50   fully symbolic evaluation and highest trustablity, with the usual
    51   performance of the simplifier.  Note that for operations on abstract
    52   datatypes (cf.~\secref{sec:invariant}), the original theorems as
    53   given by the users are used, not the modified ones.
    54 \<close>
    55 
    56 
    57 subsubsection \<open>Normalization by evaluation (@{text nbe})\<close>
    58 
    59 text \<open>
    60   Normalization by evaluation @{cite "Aehlig-Haftmann-Nipkow:2008:nbe"}
    61   provides a comparably fast partially symbolic evaluation which
    62   permits also normalization of functions and uninterpreted symbols;
    63   the stack of code to be trusted is considerable.
    64 \<close>
    65 
    66 
    67 subsubsection \<open>Evaluation in ML (@{text code})\<close>
    68 
    69 text \<open>
    70   Highest performance can be achieved by evaluation in ML, at the cost
    71   of being restricted to ground results and a layered stack of code to
    72   be trusted, including code generator configurations by the user.
    73 
    74   Evaluation is carried out in a target language \emph{Eval} which
    75   inherits from \emph{SML} but for convenience uses parts of the
    76   Isabelle runtime environment.  The soundness of computation carried
    77   out there depends crucially on the correctness of the code
    78   generator setup; this is one of the reasons why you should not use
    79   adaptation (see \secref{sec:adaptation}) frivolously.
    80 \<close>
    81 
    82 
    83 subsection \<open>Aspects of evaluation\<close>
    84 
    85 text \<open>
    86   Each of the techniques can be combined with different aspects.  The
    87   most important distinction is between dynamic and static evaluation.
    88   Dynamic evaluation takes the code generator configuration \qt{as it
    89   is} at the point where evaluation is issued.  Best example is the
    90   @{command_def value} command which allows ad-hoc evaluation of
    91   terms:
    92 \<close>
    93 
    94 value %quote "42 / (12 :: rat)"
    95 
    96 text \<open>
    97   \noindent @{command value} tries first to evaluate using ML, falling
    98   back to normalization by evaluation if this fails.
    99 
   100   A particular technique may be specified in square brackets, e.g.
   101 \<close>
   102 
   103 value %quote [nbe] "42 / (12 :: rat)"
   104 
   105 text \<open>
   106   To employ dynamic evaluation in the document generation, there is also
   107   a @{text value} antiquotation with the same evaluation techniques 
   108   as @{command value}.
   109 
   110   Static evaluation freezes the code generator configuration at a
   111   certain point and uses this context whenever evaluation is issued
   112   later on.  This is particularly appropriate for proof procedures
   113   which use evaluation, since then the behaviour of evaluation is not
   114   changed or even compromised later on by actions of the user.
   115 
   116   As a technical complication, terms after evaluation in ML must be
   117   turned into Isabelle's internal term representation again.  Since
   118   this is also configurable, it is never fully trusted.  For this
   119   reason, evaluation in ML comes with further aspects:
   120 
   121   \begin{description}
   122 
   123     \item[Plain evaluation.]  A term is normalized using the provided
   124       term reconstruction from ML to Isabelle; for applications which
   125       do not need to be fully trusted.
   126 
   127     \item[Property conversion.]  Evaluates propositions; since these
   128       are monomorphic, the term reconstruction is fixed once and for all
   129       and therefore trustable.
   130 
   131     \item[Conversion.]  Evaluates an arbitrary term @{term "t"} first
   132       by plain evaluation and certifies the result @{term "t'"} by
   133       checking the equation @{term "t \<equiv> t'"} using property
   134       conversion.
   135 
   136   \end{description}
   137 
   138   \noindent The picture is further complicated by the roles of
   139   exceptions.  Here three cases have to be distinguished:
   140 
   141   \begin{itemize}
   142 
   143     \item Evaluation of @{term t} terminates with a result @{term
   144       "t'"}.
   145 
   146     \item Evaluation of @{term t} terminates which an exception
   147       indicating a pattern match failure or a non-implemented
   148       function.  As sketched in \secref{sec:partiality}, this can be
   149       interpreted as partiality.
   150      
   151     \item Evaluation raises any other kind of exception.
   152      
   153   \end{itemize}
   154 
   155   \noindent For conversions, the first case yields the equation @{term
   156   "t = t'"}, the second defaults to reflexivity @{term "t = t"}.
   157   Exceptions of the third kind are propagated to the user.
   158 
   159   By default return values of plain evaluation are optional, yielding
   160   @{text "SOME t'"} in the first case, @{text "NONE"} in the
   161   second, and propagating the exception in the third case.  A strict
   162   variant of plain evaluation either yields @{text "t'"} or propagates
   163   any exception, a liberal variant captures any exception in a result
   164   of type @{text "Exn.result"}.
   165   
   166   For property conversion (which coincides with conversion except for
   167   evaluation in ML), methods are provided which solve a given goal by
   168   evaluation.
   169 \<close>
   170 
   171 
   172 subsection \<open>Schematic overview\<close>
   173 
   174 text \<open>
   175   \newcommand{\ttsize}{\fontsize{5.8pt}{8pt}\selectfont}
   176   \fontsize{9pt}{12pt}\selectfont
   177   \begin{tabular}{ll||c|c|c}
   178     & & @{text simp} & @{text nbe} & @{text code} \tabularnewline \hline \hline
   179     \multirow{5}{1ex}{\rotatebox{90}{dynamic}}
   180       & interactive evaluation 
   181       & @{command value} @{text "[simp]"} & @{command value} @{text "[nbe]"} & @{command value} @{text "[code]"}
   182       \tabularnewline
   183     & plain evaluation & & & \ttsize@{ML "Code_Evaluation.dynamic_value"} \tabularnewline \cline{2-5}
   184     & evaluation method & @{method code_simp} & @{method normalization} & @{method eval} \tabularnewline
   185     & property conversion & & & \ttsize@{ML "Code_Runtime.dynamic_holds_conv"} \tabularnewline \cline{2-5}
   186     & conversion & \ttsize@{ML "Code_Simp.dynamic_conv"} & \ttsize@{ML "Nbe.dynamic_conv"}
   187       & \ttsize@{ML "Code_Evaluation.dynamic_conv"} \tabularnewline \hline \hline
   188     \multirow{3}{1ex}{\rotatebox{90}{static}}
   189     & plain evaluation & & & \ttsize@{ML "Code_Evaluation.static_value"} \tabularnewline \cline{2-5}
   190     & property conversion & &
   191       & \ttsize@{ML "Code_Runtime.static_holds_conv"} \tabularnewline \cline{2-5}
   192     & conversion & \ttsize@{ML "Code_Simp.static_conv"}
   193       & \ttsize@{ML "Nbe.static_conv"}
   194       & \ttsize@{ML "Code_Evaluation.static_conv"}
   195   \end{tabular}
   196 \<close>
   197 
   198 text \<open>
   199   \noindent Note that @{ML Code_Evaluation.static_value} and
   200   @{ML Code_Evaluation.static_conv} require certain code equations to
   201   reconstruct Isabelle terms from results and certify results.  This is
   202   achieved as follows:
   203 
   204   \<^enum> Identify which result types are expected.
   205 
   206   \<^enum> Define an auxiliary operation which for each possible result type @{text \<tau>}
   207     contains a term @{const Code_Evaluation.TERM_OF} of type @{text "\<tau> itself"}
   208     (for @{ML Code_Evaluation.static_value}) or
   209     a term @{const Code_Evaluation.TERM_OF_EQUAL} of type @{text "\<tau> itself"}
   210     (for @{ML Code_Evaluation.static_conv}) respectively.
   211 
   212   \<^enum> Include that auxiliary operation into the set of constants when generating
   213     the static conversion.
   214 \<close>
   215 
   216 
   217 subsection \<open>Preprocessing HOL terms into evaluable shape\<close>
   218 
   219 text \<open>
   220   When integrating decision procedures developed inside HOL into HOL itself,
   221   it is necessary to somehow get from the Isabelle/ML representation to
   222   the representation used by the decision procedure itself (``reification'').
   223   One option is to hardcode it using code antiquotations (see \secref{sec:code_antiq}).
   224   Another option is to use pre-existing infrastructure in HOL:
   225   @{ML "Reification.conv"} and @{ML "Reification.tac"}
   226 
   227   A simplistic example:
   228 \<close>
   229 
   230 datatype %quote form_ord = T | F | Less nat nat
   231   | And form_ord form_ord | Or form_ord form_ord | Neg form_ord
   232 
   233 primrec %quote interp :: "form_ord \<Rightarrow> 'a::order list \<Rightarrow> bool"
   234 where
   235   "interp T vs \<longleftrightarrow> True"
   236 | "interp F vs \<longleftrightarrow> False"
   237 | "interp (Less i j) vs \<longleftrightarrow> vs ! i < vs ! j"
   238 | "interp (And f1 f2) vs \<longleftrightarrow> interp f1 vs \<and> interp f2 vs"
   239 | "interp (Or f1 f2) vs \<longleftrightarrow> interp f1 vs \<or> interp f2 vs"
   240 | "interp (Neg f) vs \<longleftrightarrow> \<not> interp f vs"
   241 
   242 text \<open>
   243   The datatype @{type form_ord} represents formulae whose semantics is given by
   244   @{const interp}.  Note that values are represented by variable indices (@{typ nat})
   245   whose concrete values are given in list @{term vs}.
   246 \<close>
   247 
   248 ML (*<*) \<open>\<close>
   249 schematic_goal "thm": fixes x y z :: "'a::order" shows "x < y \<and> x < z \<equiv> ?P"
   250 ML_prf 
   251 (*>*) \<open>val thm =
   252   Reification.conv @{context} @{thms interp.simps} @{cterm "x < y \<and> x < z"}\<close> (*<*)
   253 by (tactic \<open>ALLGOALS (resolve_tac @{context} [thm])\<close>)
   254 (*>*) 
   255 
   256 text \<open>
   257   By virtue of @{fact interp.simps}, @{ML "Reification.conv"} provides a conversion
   258   which, for this concrete example, yields @{thm thm [no_vars]}.  Note that the argument
   259   to @{const interp} does not contain any free variables and can thus be evaluated
   260   using evaluation.
   261 
   262   A less meager example can be found in the AFP, session @{text "Regular-Sets"},
   263   theory @{text Regexp_Method}.
   264 \<close>
   265 
   266 
   267 subsection \<open>Intimate connection between logic and system runtime\<close>
   268 
   269 text \<open>
   270   The toolbox of static evaluation conversions forms a reasonable base
   271   to interweave generated code and system tools.  However in some
   272   situations more direct interaction is desirable.
   273 \<close>
   274 
   275 
   276 subsubsection \<open>Static embedding of generated code into system runtime -- the @{text code} antiquotation \label{sec:code_antiq}\<close>
   277 
   278 text \<open>
   279   The @{text code} antiquotation allows to include constants from
   280   generated code directly into ML system code, as in the following toy
   281   example:
   282 \<close>
   283 
   284 datatype %quote form = T | F | And form form | Or form form (*<*)
   285 
   286 (*>*) ML %quotett \<open>
   287   fun eval_form @{code T} = true
   288     | eval_form @{code F} = false
   289     | eval_form (@{code And} (p, q)) =
   290         eval_form p andalso eval_form q
   291     | eval_form (@{code Or} (p, q)) =
   292         eval_form p orelse eval_form q;
   293 \<close>
   294 
   295 text \<open>
   296   \noindent @{text code} takes as argument the name of a constant;
   297   after the whole ML is read, the necessary code is generated
   298   transparently and the corresponding constant names are inserted.
   299   This technique also allows to use pattern matching on constructors
   300   stemming from compiled datatypes.  Note that the @{text code}
   301   antiquotation may not refer to constants which carry adaptations;
   302   here you have to refer to the corresponding adapted code directly.
   303 
   304   For a less simplistic example, theory @{text Approximation} in
   305   the @{text Decision_Procs} session is a good reference.
   306 \<close>
   307 
   308 
   309 subsubsection \<open>Static embedding of generated code into system runtime -- @{text code_reflect}\<close>
   310 
   311 text \<open>
   312   The @{text code} antiquoation is lightweight, but the generated code
   313   is only accessible while the ML section is processed.  Sometimes this
   314   is not appropriate, especially if the generated code contains datatype
   315   declarations which are shared with other parts of the system.  In these
   316   cases, @{command_def code_reflect} can be used:
   317 \<close>
   318 
   319 code_reflect %quote Sum_Type
   320   datatypes sum = Inl | Inr
   321   functions "Sum_Type.sum.projl" "Sum_Type.sum.projr"
   322 
   323 text \<open>
   324   \noindent @{command_def code_reflect} takes a structure name and
   325   references to datatypes and functions; for these code is compiled
   326   into the named ML structure and the \emph{Eval} target is modified
   327   in a way that future code generation will reference these
   328   precompiled versions of the given datatypes and functions.  This
   329   also allows to refer to the referenced datatypes and functions from
   330   arbitrary ML code as well.
   331 
   332   A typical example for @{command code_reflect} can be found in the
   333   @{theory Predicate} theory.
   334 \<close>
   335 
   336 
   337 subsubsection \<open>Separate compilation -- @{text code_reflect}\<close>
   338 
   339 text \<open>
   340   For technical reasons it is sometimes necessary to separate
   341   generation and compilation of code which is supposed to be used in
   342   the system runtime.  For this @{command code_reflect} with an
   343   optional \<^theory_text>\<open>file\<close> argument can be used:
   344 \<close>
   345 
   346 code_reflect %quote Rat
   347   datatypes rat
   348   functions Fract
   349     "(plus :: rat \<Rightarrow> rat \<Rightarrow> rat)" "(minus :: rat \<Rightarrow> rat \<Rightarrow> rat)"
   350     "(times :: rat \<Rightarrow> rat \<Rightarrow> rat)" "(divide :: rat \<Rightarrow> rat \<Rightarrow> rat)"
   351   file "$ISABELLE_TMP/examples/rat.ML"
   352 
   353 text \<open>
   354   \noindent This merely generates the referenced code to the given
   355   file which can be included into the system runtime later on.
   356 \<close>
   357 
   358 end
   359