src/Doc/Codegen/Evaluation.thy
changeset 48985 5386df44a037
parent 48951 b9238cbcdd41
child 51713 4fd969609b4d
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/src/Doc/Codegen/Evaluation.thy	Tue Aug 28 18:57:32 2012 +0200
     1.3 @@ -0,0 +1,287 @@
     1.4 +theory Evaluation
     1.5 +imports Setup
     1.6 +begin
     1.7 +
     1.8 +section {* Evaluation \label{sec:evaluation} *}
     1.9 +
    1.10 +text {*
    1.11 +  Recalling \secref{sec:principle}, code generation turns a system of
    1.12 +  equations into a program with the \emph{same} equational semantics.
    1.13 +  As a consequence, this program can be used as a \emph{rewrite
    1.14 +  engine} for terms: rewriting a term @{term "t"} using a program to a
     1.15 +  term @{term "t'"} yields the theorem @{prop "t \<equiv> t'"}.  In the
     1.16 +  following, this application of code generation is referred to as
     1.17 +  \emph{evaluation}.
    1.18 +*}
    1.19 +
    1.20 +
    1.21 +subsection {* Evaluation techniques *}
    1.22 +
    1.23 +text {*
     1.24 +  The existing infrastructure provides a rich palette of evaluation
     1.25 +  techniques, which differ in the following aspects:
    1.26 +
    1.27 +  \begin{description}
    1.28 +
     1.29 +    \item[Expressiveness.]  Depending on how well symbolic computation
     1.30 +      is supported, the class of terms which can be evaluated may be
     1.31 +      bigger or smaller.
    1.32 +
     1.33 +    \item[Efficiency.]  The closer a technique is to the machine
     1.34 +      level, the faster it is.
    1.35 +
     1.36 +    \item[Trustability.]  Techniques which involve a huge (and
     1.37 +      probably more configurable) infrastructure are more fragile
     1.38 +      and less trustable.
    1.39 +
    1.40 +  \end{description}
    1.41 +*}
    1.42 +
    1.43 +
    1.44 +subsubsection {* The simplifier (@{text simp}) *}
    1.45 +
    1.46 +text {*
     1.47 +  The simplest way to evaluate terms is to use the simplifier with
     1.48 +  the original code equations of the underlying program.  This gives
     1.49 +  fully symbolic evaluation and the highest trustability, with the
     1.50 +  usual performance of the simplifier.  Note that for operations on
     1.51 +  abstract datatypes (cf.~\secref{sec:invariant}), the original
     1.52 +  theorems as given by the user are used, not the modified ones.
    1.53 +*}
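
text {*
  \noindent For instance, a list expression can be rewritten to its
  normal form purely by the simplifier (a minimal example, invoked
  here via the @{command value} command explained below):
*}

value %quote [simp] "rev [1, 2, 3 :: int]"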
    1.54 +
    1.55 +
    1.56 +subsubsection {* Normalization by evaluation (@{text nbe}) *}
    1.57 +
    1.58 +text {*
     1.59 +  Normalization by evaluation \cite{Aehlig-Haftmann-Nipkow:2008:nbe}
     1.60 +  provides a comparably fast, partially symbolic evaluation which
     1.61 +  also permits normalization of functions and uninterpreted symbols;
     1.62 +  however, the stack of code to be trusted is considerable.
    1.63 +*}
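
text {*
  \noindent Thanks to the partially symbolic nature of this technique,
  terms containing free variables can be normalized; e.g.~the
  following minimal example evaluates to @{term "[c, b, a]"}:
*}

value %quote [nbe] "rev [a, b, c]"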
    1.64 +
    1.65 +
    1.66 +subsubsection {* Evaluation in ML (@{text code}) *}
    1.67 +
    1.68 +text {*
    1.69 +  Highest performance can be achieved by evaluation in ML, at the cost
    1.70 +  of being restricted to ground results and a layered stack of code to
    1.71 +  be trusted, including code generator configurations by the user.
    1.72 +
    1.73 +  Evaluation is carried out in a target language \emph{Eval} which
    1.74 +  inherits from \emph{SML} but for convenience uses parts of the
    1.75 +  Isabelle runtime environment.  The soundness of computation carried
    1.76 +  out there depends crucially on the correctness of the code
    1.77 +  generator setup; this is one of the reasons why you should not use
    1.78 +  adaptation (see \secref{sec:adaptation}) frivolously.
    1.79 +*}
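
text {*
  \noindent Due to the restriction to ground results, a term like
  @{term "rev [a, b, c]"} with free variables cannot be evaluated in
  ML, whereas a ground instance poses no problem (a minimal example):
*}

value %quote [code] "rev [1, 2, 3 :: int]"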
    1.80 +
    1.81 +
    1.82 +subsection {* Aspects of evaluation *}
    1.83 +
    1.84 +text {*
    1.85 +  Each of the techniques can be combined with different aspects.  The
    1.86 +  most important distinction is between dynamic and static evaluation.
    1.87 +  Dynamic evaluation takes the code generator configuration \qt{as it
     1.88 +  is} at the point where evaluation is issued.  The best example is
     1.89 +  the @{command_def value} command, which allows ad-hoc evaluation of
     1.90 +  terms:
    1.91 +*}
    1.92 +
    1.93 +value %quote "42 / (12 :: rat)"
    1.94 +
    1.95 +text {*
    1.96 +  \noindent By default @{command value} tries all available evaluation
    1.97 +  techniques and prints the result of the first succeeding one.  A particular
    1.98 +  technique may be specified in square brackets, e.g.
    1.99 +*}
   1.100 +
   1.101 +value %quote [nbe] "42 / (12 :: rat)"
   1.102 +
   1.103 +text {*
    1.104 +  To employ dynamic evaluation in document generation, there is also
   1.105 +  a @{text value} antiquotation. By default, it also tries all available evaluation
   1.106 +  techniques and prints the result of the first succeeding one, unless a particular
   1.107 +  technique is specified in square brackets.
   1.108 +
   1.109 +  Static evaluation freezes the code generator configuration at a
   1.110 +  certain point and uses this context whenever evaluation is issued
   1.111 +  later on.  This is particularly appropriate for proof procedures
   1.112 +  which use evaluation, since then the behaviour of evaluation is not
   1.113 +  changed or even compromised later on by actions of the user.
   1.114 +
   1.115 +  As a technical complication, terms after evaluation in ML must be
   1.116 +  turned into Isabelle's internal term representation again.  Since
   1.117 +  this is also configurable, it is never fully trusted.  For this
   1.118 +  reason, evaluation in ML comes with further aspects:
   1.119 +
   1.120 +  \begin{description}
   1.121 +
   1.122 +    \item[Plain evaluation.]  A term is normalized using the provided
   1.123 +      term reconstruction from ML to Isabelle; for applications which
   1.124 +      do not need to be fully trusted.
   1.125 +
   1.126 +    \item[Property conversion.]  Evaluates propositions; since these
   1.127 +      are monomorphic, the term reconstruction is fixed once and for all
   1.128 +      and therefore trustable.
   1.129 +
   1.130 +    \item[Conversion.]  Evaluates an arbitrary term @{term "t"} first
   1.131 +      by plain evaluation and certifies the result @{term "t'"} by
   1.132 +      checking the equation @{term "t \<equiv> t'"} using property
   1.133 +      conversion.
   1.134 +
   1.135 +  \end{description}
   1.136 +
    1.137 +  \noindent The picture is further complicated by the role of
    1.138 +  exceptions.  Here, three cases have to be distinguished:
   1.139 +
   1.140 +  \begin{itemize}
   1.141 +
   1.142 +    \item Evaluation of @{term t} terminates with a result @{term
   1.143 +      "t'"}.
   1.144 +
    1.145 +    \item Evaluation of @{term t} terminates with an exception
   1.146 +      indicating a pattern match failure or a non-implemented
   1.147 +      function.  As sketched in \secref{sec:partiality}, this can be
   1.148 +      interpreted as partiality.
   1.149 +     
   1.150 +    \item Evaluation raises any other kind of exception.
   1.151 +     
   1.152 +  \end{itemize}
   1.153 +
   1.154 +  \noindent For conversions, the first case yields the equation @{term
   1.155 +  "t = t'"}, the second defaults to reflexivity @{term "t = t"}.
   1.156 +  Exceptions of the third kind are propagated to the user.
   1.157 +
    1.158 +  By default, return values of plain evaluation are optional,
    1.159 +  yielding @{text "SOME t'"} in the first case, @{text "NONE"} in the
    1.160 +  second, and propagating the exception in the third case.  A strict
    1.161 +  variant of plain evaluation either yields @{text "t'"} or propagates
    1.162 +  any exception, while a liberal variant captures any exception in a
    1.163 +  result of type @{text "Exn.result"}.
   1.164 +  
   1.165 +  For property conversion (which coincides with conversion except for
   1.166 +  evaluation in ML), methods are provided which solve a given goal by
   1.167 +  evaluation.
   1.168 +*}
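
text {*
  \noindent As a minimal illustration, such evaluation methods (cf.~the
  schematic overview below) can solve executable goals directly:
*}

lemma %quote "rev [a, b, c] = [c, b, a]"
  by normalization

lemma %quote "(7 :: nat) * 6 = 42"
  by eval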
   1.169 +
   1.170 +
   1.171 +subsection {* Schematic overview *}
   1.172 +
   1.173 +text {*
   1.174 +  \newcommand{\ttsize}{\fontsize{5.8pt}{8pt}\selectfont}
   1.175 +  \fontsize{9pt}{12pt}\selectfont
   1.176 +  \begin{tabular}{ll||c|c|c}
   1.177 +    & & @{text simp} & @{text nbe} & @{text code} \tabularnewline \hline \hline
   1.178 +    \multirow{5}{1ex}{\rotatebox{90}{dynamic}}
   1.179 +      & interactive evaluation 
   1.180 +      & @{command value} @{text "[simp]"} & @{command value} @{text "[nbe]"} & @{command value} @{text "[code]"}
   1.181 +      \tabularnewline
   1.182 +    & plain evaluation & & & \ttsize@{ML "Code_Evaluation.dynamic_value"} \tabularnewline \cline{2-5}
   1.183 +    & evaluation method & @{method code_simp} & @{method normalization} & @{method eval} \tabularnewline
   1.184 +    & property conversion & & & \ttsize@{ML "Code_Runtime.dynamic_holds_conv"} \tabularnewline \cline{2-5}
   1.185 +    & conversion & \ttsize@{ML "Code_Simp.dynamic_conv"} & \ttsize@{ML "Nbe.dynamic_conv"}
   1.186 +      & \ttsize@{ML "Code_Evaluation.dynamic_conv"} \tabularnewline \hline \hline
   1.187 +    \multirow{3}{1ex}{\rotatebox{90}{static}}
   1.188 +    & plain evaluation & & & \ttsize@{ML "Code_Evaluation.static_value"} \tabularnewline \cline{2-5}
   1.189 +    & property conversion & &
   1.190 +      & \ttsize@{ML "Code_Runtime.static_holds_conv"} \tabularnewline \cline{2-5}
   1.191 +    & conversion & \ttsize@{ML "Code_Simp.static_conv"}
   1.192 +      & \ttsize@{ML "Nbe.static_conv"}
   1.193 +      & \ttsize@{ML "Code_Evaluation.static_conv"}
   1.194 +  \end{tabular}
   1.195 +*}
   1.196 +
   1.197 +
   1.198 +subsection {* Intimate connection between logic and system runtime *}
   1.199 +
   1.200 +text {*
   1.201 +  The toolbox of static evaluation conversions forms a reasonable base
   1.202 +  to interweave generated code and system tools.  However in some
   1.203 +  situations more direct interaction is desirable.
   1.204 +*}
   1.205 +
   1.206 +
   1.207 +subsubsection {* Static embedding of generated code into system runtime -- the @{text code} antiquotation *}
   1.208 +
   1.209 +text {*
    1.210 +  The @{text code} antiquotation makes it possible to include
    1.211 +  constants from generated code directly in ML system code, as in the
    1.212 +  following toy example:
   1.213 +*}
   1.214 +
   1.215 +datatype %quote form = T | F | And form form | Or form form (*<*)
   1.216 +
   1.217 +(*>*) ML %quotett {*
   1.218 +  fun eval_form @{code T} = true
   1.219 +    | eval_form @{code F} = false
   1.220 +    | eval_form (@{code And} (p, q)) =
   1.221 +        eval_form p andalso eval_form q
   1.222 +    | eval_form (@{code Or} (p, q)) =
   1.223 +        eval_form p orelse eval_form q;
   1.224 +*}
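
text {*
  \noindent The resulting function can be applied to values built from
  the same code constants; e.g.~the following hypothetical check
  yields @{text true}:
*}

ML_val %quotett {*
  eval_form (@{code And} (@{code T}, @{code Or} (@{code F}, @{code T})))
*}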
   1.225 +
   1.226 +text {*
    1.227 +  \noindent @{text code} takes as argument the name of a constant;
    1.228 +  after the whole ML section is read, the necessary code is generated
    1.229 +  transparently and the corresponding constant names are inserted.
    1.230 +  This technique also allows pattern matching on constructors
    1.231 +  stemming from compiled datatypes.  Note that the @{text code}
    1.232 +  antiquotation may not refer to constants which carry adaptations;
    1.233 +  here you have to refer to the corresponding adapted code directly.
   1.234 +
   1.235 +  For a less simplistic example, theory @{text Approximation} in
   1.236 +  the @{text Decision_Procs} session is a good reference.
   1.237 +*}
   1.238 +
   1.239 +
   1.240 +subsubsection {* Static embedding of generated code into system runtime -- @{text code_reflect} *}
   1.241 +
   1.242 +text {*
    1.243 +  The @{text code} antiquotation is lightweight, but the generated code
   1.244 +  is only accessible while the ML section is processed.  Sometimes this
   1.245 +  is not appropriate, especially if the generated code contains datatype
   1.246 +  declarations which are shared with other parts of the system.  In these
   1.247 +  cases, @{command_def code_reflect} can be used:
   1.248 +*}
   1.249 +
   1.250 +code_reflect %quote Sum_Type
   1.251 +  datatypes sum = Inl | Inr
   1.252 +  functions "Sum_Type.Projl" "Sum_Type.Projr"
   1.253 +
   1.254 +text {*
    1.255 +  \noindent @{command code_reflect} takes a structure name and
    1.256 +  references to datatypes and functions; for these, code is compiled
    1.257 +  into the named ML structure, and the \emph{Eval} target is modified
    1.258 +  such that future code generation will reference these precompiled
    1.259 +  versions of the given datatypes and functions.  This also makes it
    1.260 +  possible to refer to these datatypes and functions from arbitrary
    1.261 +  ML code.
   1.262 +
   1.263 +  A typical example for @{command code_reflect} can be found in the
   1.264 +  @{theory Predicate} theory.
   1.265 +*}
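
text {*
  \noindent Afterwards the datatype constructors are available as
  regular ML values; e.g.~pattern matching works as usual (a sketch,
  assuming the generated constructor names @{text Inl} and
  @{text Inr}):
*}

ML %quotett {*
  fun from_sum (Sum_Type.Inl x) = x
    | from_sum (Sum_Type.Inr y) = y;
*}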
   1.266 +
   1.267 +
   1.268 +subsubsection {* Separate compilation -- @{text code_reflect} *}
   1.269 +
   1.270 +text {*
   1.271 +  For technical reasons it is sometimes necessary to separate
   1.272 +  generation and compilation of code which is supposed to be used in
   1.273 +  the system runtime.  For this @{command code_reflect} with an
   1.274 +  optional @{text "file"} argument can be used:
   1.275 +*}
   1.276 +
   1.277 +code_reflect %quote Rat
   1.278 +  datatypes rat = Frct
   1.279 +  functions Fract
   1.280 +    "(plus :: rat \<Rightarrow> rat \<Rightarrow> rat)" "(minus :: rat \<Rightarrow> rat \<Rightarrow> rat)"
   1.281 +    "(times :: rat \<Rightarrow> rat \<Rightarrow> rat)" "(divide :: rat \<Rightarrow> rat \<Rightarrow> rat)"
   1.282 +  file "examples/rat.ML"
   1.283 +
   1.284 +text {*
    1.285 +  \noindent This merely generates the referenced code into the given
    1.286 +  file, which can be included in the system runtime later on.
   1.287 +*}
   1.288 +
   1.289 +end
   1.290 +