src/Doc/Codegen/Further.thy
author haftmann
Fri Jul 04 20:18:47 2014 +0200 (2014-07-04)
changeset 57512 cc97b347b301
parent 55757 9fc71814b8c1
child 58620 7435b6a3f72e
permissions -rw-r--r--
reduced name variants for assoc and commute on plus and mult
     1 theory Further
     2 imports Setup
     3 begin
     4 
     5 section {* Further issues \label{sec:further} *}
     6 
     7 subsection {* Specialities of the @{text Scala} target language \label{sec:scala} *}
     8 
     9 text {*
    10   @{text Scala} deviates from languages of the ML family in a couple
    11   of aspects; those which affect code generation mainly have to do with
    12   @{text Scala}'s type system:
    13 
    14   \begin{itemize}
    15 
    16     \item @{text Scala} prefers tupled syntax over curried syntax.
    17 
    18     \item @{text Scala} sacrifices Hindely-Milner type inference for a
    19       much more rich type system with subtyping etc.  For this reason
    20       type arguments sometimes have to be given explicitly in square
    21       brackets (mimicking System F syntax).
    22 
    23     \item In contrast to @{text Haskell} where most specialities of
    24       the type system are implemented using \emph{type classes},
    25       @{text Scala} provides a sophisticated system of \emph{implicit
    26       arguments}.
    27 
    28   \end{itemize}
    29 
    30   \noindent Concerning currying, the @{text Scala} serializer counts
    31   arguments in code equations to determine how many arguments
    32   shall be tupled; remaining arguments and abstractions in terms
    33   rather than function definitions are always curried.
    34 
    35   The second aspect affects user-defined adaptations with @{command
    36   code_printing}.  For regular terms, the @{text Scala} serializer prints
    37   all type arguments explicitly.  For user-defined term adaptations
    38   this is only possible for adaptations which take no arguments: here
    39   the type arguments are just appended.  Otherwise they are ignored;
    40   hence user-defined adaptations for polymorphic constants have to be
    41   designed very carefully to avoid ambiguity.
    42 
    43   Isabelle's type classes are mapped onto @{text Scala} implicits; in
    44   cases with diamonds in the subclass hierarchy this can lead to
    45   ambiguities in the generated code:
    46 *}
    47 
    48 class %quote class1 =
    49   fixes foo :: "'a \<Rightarrow> 'a"
    50 
    51 class %quote class2 = class1
    52 
    53 class %quote class3 = class1
    54 
    55 text {*
    56   \noindent Here both @{class class2} and @{class class3} inherit from @{class class1},
    57   forming the upper part of a diamond.
    58 *}
    59 
    60 definition %quote bar :: "'a :: {class2, class3} \<Rightarrow> 'a" where
    61   "bar = foo"
    62 
    63 text {*
    64   \noindent This yields the following code:
    65 *}
    66 
    67 text %quotetypewriter {*
    68   @{code_stmts bar (Scala)}
    69 *}
    70 
    71 text {*
    72   \noindent This code is rejected by the @{text Scala} compiler: in
    73   the definition of @{text bar}, it is not clear from where to derive
    74   the implicit argument for @{text foo}.
    75 
    76   The solution to the problem is to close the diamond by a further
    77   class with inherits from both @{class class2} and @{class class3}:
    78 *}
    79 
    80 class %quote class4 = class2 + class3
    81 
    82 text {*
    83   \noindent Then the offending code equation can be restricted to
    84   @{class class4}:
    85 *}
    86 
    87 lemma %quote [code]:
    88   "(bar :: 'a::class4 \<Rightarrow> 'a) = foo"
    89   by (simp only: bar_def)
    90 
    91 text {*
    92   \noindent with the following code:
    93 *}
    94 
    95 text %quotetypewriter {*
    96   @{code_stmts bar (Scala)}
    97 *}
    98 
    99 text {*
   100   \noindent which exposes no ambiguity.
   101 
   102   Since the preprocessor (cf.~\secref{sec:preproc}) propagates sort
   103   constraints through a system of code equations, it is usually not
   104   very difficult to identify the set of code equations which actually
   105   needs more restricted sort constraints.
   106 *}
   107 
   108 subsection {* Modules namespace *}
   109 
   110 text {*
   111   When invoking the @{command export_code} command it is possible to
   112   leave out the @{keyword "module_name"} part; then code is
   113   distributed over different modules, where the module name space
   114   roughly is induced by the Isabelle theory name space.
   115 
   116   Then sometimes the awkward situation occurs that dependencies
   117   between definitions introduce cyclic dependencies between modules,
   118   which in the @{text Haskell} world leaves you to the mercy of the
   119   @{text Haskell} implementation you are using, while for @{text
   120   SML}/@{text OCaml} code generation is not possible.
   121 
   122   A solution is to declare module names explicitly.  Let use assume
   123   the three cyclically dependent modules are named \emph{A}, \emph{B}
   124   and \emph{C}.  Then, by stating
   125 *}
   126 
   127 code_identifier %quote
   128   code_module A \<rightharpoonup> (SML) ABC
   129 | code_module B \<rightharpoonup> (SML) ABC
   130 | code_module C \<rightharpoonup> (SML) ABC
   131 
   132 text {*
   133   \noindent we explicitly map all those modules on \emph{ABC},
   134   resulting in an ad-hoc merge of this three modules at serialisation
   135   time.
   136 *}
   137 
   138 subsection {* Locales and interpretation *}
   139 
   140 text {*
   141   A technical issue comes to surface when generating code from
   142   specifications stemming from locale interpretation.
   143 
   144   Let us assume a locale specifying a power operation on arbitrary
   145   types:
   146 *}
   147 
   148 locale %quote power =
   149   fixes power :: "'a \<Rightarrow> 'b \<Rightarrow> 'b"
   150   assumes power_commute: "power x \<circ> power y = power y \<circ> power x"
   151 begin
   152 
   153 text {*
   154   \noindent Inside that locale we can lift @{text power} to exponent
   155   lists by means of specification relative to that locale:
   156 *}
   157 
   158 primrec %quote powers :: "'a list \<Rightarrow> 'b \<Rightarrow> 'b" where
   159   "powers [] = id"
   160 | "powers (x # xs) = power x \<circ> powers xs"
   161 
   162 lemma %quote powers_append:
   163   "powers (xs @ ys) = powers xs \<circ> powers ys"
   164   by (induct xs) simp_all
   165 
   166 lemma %quote powers_power:
   167   "powers xs \<circ> power x = power x \<circ> powers xs"
   168   by (induct xs)
   169     (simp_all del: o_apply id_apply add: comp_assoc,
   170       simp del: o_apply add: o_assoc power_commute)
   171 
   172 lemma %quote powers_rev:
   173   "powers (rev xs) = powers xs"
   174     by (induct xs) (simp_all add: powers_append powers_power)
   175 
   176 end %quote
   177 
   178 text {*
   179   After an interpretation of this locale (say, @{command_def
   180   interpretation} @{text "fun_power:"} @{term [source] "power (\<lambda>n (f
   181   :: 'a \<Rightarrow> 'a). f ^^ n)"}), one would expect to have a constant @{text
   182   "fun_power.powers :: nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a"} for which code
   183   can be generated.  But this not the case: internally, the term
   184   @{text "fun_power.powers"} is an abbreviation for the foundational
   185   term @{term [source] "power.powers (\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"}
   186   (see \cite{isabelle-locale} for the details behind).
   187 
   188   Fortunately, with minor effort the desired behaviour can be
   189   achieved.  First, a dedicated definition of the constant on which
   190   the local @{text "powers"} after interpretation is supposed to be
   191   mapped on:
   192 *}
   193 
   194 definition %quote funpows :: "nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a" where
   195   [code del]: "funpows = power.powers (\<lambda>n f. f ^^ n)"
   196 
   197 text {*
   198   \noindent In general, the pattern is @{text "c = t"} where @{text c}
   199   is the name of the future constant and @{text t} the foundational
   200   term corresponding to the local constant after interpretation.
   201 
   202   The interpretation itself is enriched with an equation @{text "t = c"}:
   203 *}
   204 
   205 interpretation %quote fun_power: power "\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n" where
   206   "power.powers (\<lambda>n f. f ^^ n) = funpows"
   207   by unfold_locales
   208     (simp_all add: fun_eq_iff funpow_mult mult.commute funpows_def)
   209 
   210 text {*
   211   \noindent This additional equation is trivially proved by the
   212   definition itself.
   213 
   214   After this setup procedure, code generation can continue as usual:
   215 *}
   216 
   217 text %quotetypewriter {*
   218   @{code_stmts funpows (consts) Nat.funpow funpows (Haskell)}
   219 *}
   220 
   221 
   222 subsection {* Parallel computation *}
   223 
   224 text {*
   225   Theory @{text Parallel} in @{file "~~/src/HOL/Library"} contains
   226   operations to exploit parallelism inside the Isabelle/ML
   227   runtime engine.
   228 *}
   229 
   230 subsection {* Imperative data structures *}
   231 
   232 text {*
   233   If you consider imperative data structures as inevitable for a
   234   specific application, you should consider \emph{Imperative
   235   Functional Programming with Isabelle/HOL}
   236   \cite{bulwahn-et-al:2008:imperative}; the framework described there
   237   is available in session @{text Imperative_HOL}, together with a
   238   short primer document.
   239 *}
   240 
   241 
   242 subsection {* ML system interfaces \label{sec:ml} *}
   243 
   244 text {*
   245   Since the code generator framework not only aims to provide a nice
   246   Isar interface but also to form a base for code-generation-based
   247   applications, here a short description of the most fundamental ML
   248   interfaces.
   249 *}
   250 
   251 subsubsection {* Managing executable content *}
   252 
   253 text %mlref {*
   254   \begin{mldecls}
   255   @{index_ML Code.read_const: "theory -> string -> string"} \\
   256   @{index_ML Code.add_eqn: "thm -> theory -> theory"} \\
   257   @{index_ML Code.del_eqn: "thm -> theory -> theory"} \\
   258   @{index_ML Code_Preproc.map_pre: "(Proof.context -> Proof.context) -> theory -> theory"} \\
   259   @{index_ML Code_Preproc.map_post: "(Proof.context -> Proof.context) -> theory -> theory"} \\
   260   @{index_ML Code_Preproc.add_functrans: "
   261     string * (Proof.context -> (thm * bool) list -> (thm * bool) list option)
   262       -> theory -> theory"} \\
   263   @{index_ML Code_Preproc.del_functrans: "string -> theory -> theory"} \\
   264   @{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\
   265   @{index_ML Code.get_type: "theory -> string
   266     -> ((string * sort) list * (string * ((string * sort) list * typ list)) list) * bool"} \\
   267   @{index_ML Code.get_type_of_constr_or_abstr: "theory -> string -> (string * bool) option"}
   268   \end{mldecls}
   269 
   270   \begin{description}
   271 
   272   \item @{ML Code.read_const}~@{text thy}~@{text s}
   273      reads a constant as a concrete term expression @{text s}.
   274 
   275   \item @{ML Code.add_eqn}~@{text "thm"}~@{text "thy"} adds function
   276      theorem @{text "thm"} to executable content.
   277 
   278   \item @{ML Code.del_eqn}~@{text "thm"}~@{text "thy"} removes function
   279      theorem @{text "thm"} from executable content, if present.
   280 
   281   \item @{ML Code_Preproc.map_pre}~@{text "f"}~@{text "thy"} changes
   282      the preprocessor simpset.
   283 
   284   \item @{ML Code_Preproc.add_functrans}~@{text "(name, f)"}~@{text "thy"} adds
   285      function transformer @{text f} (named @{text name}) to executable content;
   286      @{text f} is a transformer of the code equations belonging
   287      to a certain function definition, depending on the
   288      current theory context.  Returning @{text NONE} indicates that no
   289      transformation took place;  otherwise, the whole process will be iterated
   290      with the new code equations.
   291 
   292   \item @{ML Code_Preproc.del_functrans}~@{text "name"}~@{text "thy"} removes
   293      function transformer named @{text name} from executable content.
   294 
   295   \item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds
   296      a datatype to executable content, with generation
   297      set @{text cs}.
   298 
   299   \item @{ML Code.get_type_of_constr_or_abstr}~@{text "thy"}~@{text "const"}
   300      returns type constructor corresponding to
   301      constructor @{text const}; returns @{text NONE}
   302      if @{text const} is no constructor.
   303 
   304   \end{description}
   305 *}
   306 
   307 
   308 subsubsection {* Data depending on the theory's executable content *}
   309 
   310 text {*
   311   Implementing code generator applications on top of the framework set
   312   out so far usually not only involves using those primitive
   313   interfaces but also storing code-dependent data and various other
   314   things.
   315 
   316   Due to incrementality of code generation, changes in the theory's
   317   executable content have to be propagated in a certain fashion.
   318   Additionally, such changes may occur not only during theory
   319   extension but also during theory merge, which is a little bit nasty
   320   from an implementation point of view.  The framework provides a
   321   solution to this technical challenge by providing a functorial data
   322   slot @{ML_functor Code_Data}; on instantiation of this functor, the
   323   following types and operations are required:
   324 
   325   \medskip
   326   \begin{tabular}{l}
   327   @{text "type T"} \\
   328   @{text "val empty: T"} \\
   329   \end{tabular}
   330 
   331   \begin{description}
   332 
   333   \item @{text T} the type of data to store.
   334 
   335   \item @{text empty} initial (empty) data.
   336 
   337   \end{description}
   338 
   339   \noindent An instance of @{ML_functor Code_Data} provides the
   340   following interface:
   341 
   342   \medskip
   343   \begin{tabular}{l}
   344   @{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\
   345   @{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"}
   346   \end{tabular}
   347 
   348   \begin{description}
   349 
   350   \item @{text change} update of current data (cached!) by giving a
   351     continuation.
   352 
   353   \item @{text change_yield} update with side result.
   354 
   355   \end{description}
   356 *}
   357 
   358 end
   359