theory Further
imports Setup
begin

section \<open>Further issues \label{sec:further}\<close>

subsection \<open>Specialities of the @{text Scala} target language \label{sec:scala}\<close>

text \<open>
  @{text Scala} deviates from languages of the ML family in a couple
  of aspects; those which affect code generation mainly have to do with
  @{text Scala}'s type system:

  \begin{itemize}

    \item @{text Scala} prefers tupled syntax over curried syntax.

    \item @{text Scala} sacrifices Hindley-Milner type inference for a
      much richer type system with subtyping etc.  For this reason
      type arguments sometimes have to be given explicitly in square
      brackets (mimicking System F syntax).

    \item In contrast to @{text Haskell}, where most specialities of
      the type system are implemented using \emph{type classes},
      @{text Scala} provides a sophisticated system of \emph{implicit
      arguments}.

  \end{itemize}

  \noindent Concerning currying, the @{text Scala} serializer counts
  arguments in code equations to determine how many arguments
  shall be tupled; remaining arguments and abstractions in terms
  rather than function definitions are always curried.

  The second aspect affects user-defined adaptations with @{command
  code_printing}.  For regular terms, the @{text Scala} serializer prints
  all type arguments explicitly.  For user-defined term adaptations
  this is only possible for adaptations which take no arguments: here
  the type arguments are just appended.  Otherwise they are ignored;
  hence user-defined adaptations for polymorphic constants have to be
  designed very carefully to avoid ambiguity.
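
  For illustration, the general shape of such a no-argument adaptation is as
  follows (a sketch using a hypothetical constant @{text dummy} and a
  hypothetical target-side object @{text DummyLib}, neither of which is part
  of any default setup):

  @{verbatim [display] \<open>code_printing
  constant dummy \<rightharpoonup> (Scala) "DummyLib.dummy"\<close>}

  \noindent Since @{text dummy} takes no arguments, the serializer can simply
  append the required type arguments in square brackets to the printed text.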

  Isabelle's type classes are mapped onto @{text Scala} implicits; in
  cases with diamonds in the subclass hierarchy this can lead to
  ambiguities in the generated code:
\<close>

class %quote class1 =
  fixes foo :: "'a \<Rightarrow> 'a"

class %quote class2 = class1

class %quote class3 = class1

text \<open>
  \noindent Here both @{class class2} and @{class class3} inherit from @{class class1},
  forming the upper part of a diamond.
\<close>

definition %quote bar :: "'a :: {class2, class3} \<Rightarrow> 'a" where
  "bar = foo"

text \<open>
  \noindent This yields the following code:
\<close>

text %quotetypewriter \<open>
  @{code_stmts bar (Scala)}
\<close>

text \<open>
  \noindent This code is rejected by the @{text Scala} compiler: in
  the definition of @{text bar}, it is not clear from where to derive
  the implicit argument for @{text foo}.

  The solution to the problem is to close the diamond by a further
  class which inherits from both @{class class2} and @{class class3}:
\<close>

class %quote class4 = class2 + class3

text \<open>
  \noindent Then the offending code equation can be restricted to
  @{class class4}:
\<close>

lemma %quote [code]:
  "(bar :: 'a::class4 \<Rightarrow> 'a) = foo"
  by (simp only: bar_def)

text \<open>
  \noindent with the following code:
\<close>

text %quotetypewriter \<open>
  @{code_stmts bar (Scala)}
\<close>

text \<open>
  \noindent which exposes no ambiguity.

  Since the preprocessor (cf.~\secref{sec:preproc}) propagates sort
  constraints through a system of code equations, it is usually not
  very difficult to identify the set of code equations which actually
  needs more restricted sort constraints.
\<close>
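
text \<open>
  \noindent The code equations that are eventually selected for a constant,
  together with those of all dependencies, can be inspected using the
  diagnostic command @{command code_thms}; e.g.~for @{const bar} from above:
\<close>

code_thms %quote bar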

subsection \<open>Modules namespace\<close>

text \<open>
  When invoking the @{command export_code} command it is possible to
  leave out the @{keyword "module_name"} part; then code is
  distributed over different modules, where the module name space
  is roughly induced by the Isabelle theory name space.

  Then sometimes the awkward situation occurs that dependencies
  between definitions introduce cyclic dependencies between modules,
  which in the @{text Haskell} world leaves you at the mercy of the
  @{text Haskell} implementation you are using, while for @{text
  SML}/@{text OCaml} code generation is not possible.

  A solution is to declare module names explicitly.  Let us assume
  the three cyclically dependent modules are named \emph{A}, \emph{B}
  and \emph{C}.  Then, by stating
\<close>

code_identifier %quote
  code_module A \<rightharpoonup> (SML) ABC
| code_module B \<rightharpoonup> (SML) ABC
| code_module C \<rightharpoonup> (SML) ABC

text \<open>
  \noindent we explicitly map all those modules onto \emph{ABC},
  resulting in an ad-hoc merge of these three modules at serialisation
  time.
\<close>
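
text \<open>
  \noindent Alternatively, the theory-induced module structure can be bypassed
  by specifying one module name for the whole export; e.g.~for constant
  @{const bar} from above:
\<close>

export_code %quote bar in SML module_name ABC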

subsection \<open>Locales and interpretation\<close>

text \<open>
  A technical issue comes to the surface when generating code from
  specifications stemming from locale interpretation into global
  theories.

  Let us assume a locale specifying a power operation on arbitrary
  types:
\<close>

locale %quote power =
  fixes power :: "'a \<Rightarrow> 'b \<Rightarrow> 'b"
  assumes power_commute: "power x \<circ> power y = power y \<circ> power x"
begin

text \<open>
  \noindent Inside that locale we can lift @{text power} to exponent
  lists by means of a specification relative to that locale:
\<close>

primrec %quote powers :: "'a list \<Rightarrow> 'b \<Rightarrow> 'b" where
  "powers [] = id"
| "powers (x # xs) = power x \<circ> powers xs"

lemma %quote powers_append:
  "powers (xs @ ys) = powers xs \<circ> powers ys"
  by (induct xs) simp_all

lemma %quote powers_power:
  "powers xs \<circ> power x = power x \<circ> powers xs"
  by (induct xs)
    (simp_all del: o_apply id_apply add: comp_assoc,
      simp del: o_apply add: o_assoc power_commute)

lemma %quote powers_rev:
  "powers (rev xs) = powers xs"
    by (induct xs) (simp_all add: powers_append powers_power)

end %quote

text \<open>
  After an interpretation of this locale (say, @{command_def
  global_interpretation} @{text "fun_power:"} @{term [source] "power (\<lambda>n (f
  :: 'a \<Rightarrow> 'a). f ^^ n)"}), one could naively expect to have a constant @{text
  "fun_power.powers :: nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a"} for which code
  can be generated.  But this is not the case: internally, the term
  @{text "fun_power.powers"} is an abbreviation for the foundational
  term @{term [source] "power.powers (\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"}
  (see @{cite "isabelle-locale"} for the details).

  Fortunately, a succinct solution is available: a dedicated
  rewrite definition:
\<close>

global_interpretation %quote fun_power: power "(\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"
  defines funpows = fun_power.powers
  by unfold_locales
    (simp_all add: fun_eq_iff funpow_mult mult.commute)

text \<open>
  \noindent This amends the interpretation morphisms such that
  occurrences of the foundational term @{term [source] "power.powers (\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"}
  are folded to a newly defined constant @{const funpows}.

  After this setup procedure, code generation can continue as usual:
\<close>

text %quotetypewriter \<open>
  @{code_stmts funpows (consts) Nat.funpow funpows (Haskell)}
\<close>


subsection \<open>Parallel computation\<close>

text \<open>
  Theory @{text Parallel} in @{file "~~/src/HOL/Library"} contains
  operations to exploit parallelism inside the Isabelle/ML
  runtime engine.
\<close>
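
text \<open>
  For illustration, a minimal sketch of how these operations are used
  (assuming an import of theory @{text Parallel}; @{text "Parallel.map"} is
  one of the operations provided there, while @{text parallel_squares} is
  merely a hypothetical example):

  @{verbatim [display] \<open>definition parallel_squares :: "nat list => nat list"
where
  "parallel_squares ns = Parallel.map (%n. n * n) ns"\<close>}

  \noindent When such code is evaluated inside the Isabelle/ML runtime,
  the applications of the mapped function are evaluated in parallel.
\<close>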

subsection \<open>Imperative data structures\<close>

text \<open>
  If you regard imperative data structures as inevitable for a
  specific application, you should consider \emph{Imperative
  Functional Programming with Isabelle/HOL}
  @{cite "bulwahn-et-al:2008:imperative"}; the framework described there
  is available in session @{text Imperative_HOL}, together with a
  short primer document.
\<close>


subsection \<open>ML system interfaces \label{sec:ml}\<close>

text \<open>
  Since the code generator framework not only aims to provide a nice
  Isar interface but also to form a base for code-generation-based
  applications, here is a short description of the most fundamental ML
  interfaces.
\<close>

subsubsection \<open>Managing executable content\<close>

text %mlref \<open>
  \begin{mldecls}
  @{index_ML Code.read_const: "theory -> string -> string"} \\
  @{index_ML Code.add_eqn: "Code.kind * bool -> thm -> theory -> theory"} \\
  @{index_ML Code.del_eqn: "thm -> theory -> theory"} \\
  @{index_ML Code_Preproc.map_pre: "(Proof.context -> Proof.context) -> theory -> theory"} \\
  @{index_ML Code_Preproc.map_post: "(Proof.context -> Proof.context) -> theory -> theory"} \\
  @{index_ML Code_Preproc.add_functrans: "
    string * (Proof.context -> (thm * bool) list -> (thm * bool) list option)
      -> theory -> theory"} \\
  @{index_ML Code_Preproc.del_functrans: "string -> theory -> theory"} \\
  @{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\
  @{index_ML Code.get_type: "theory -> string
    -> ((string * sort) list * (string * ((string * sort) list * typ list)) list) * bool"} \\
  @{index_ML Code.get_type_of_constr_or_abstr: "theory -> string -> (string * bool) option"}
  \end{mldecls}

  \begin{description}

  \item @{ML Code.read_const}~@{text thy}~@{text s}
     reads a constant as a concrete term expression @{text s}.

  \item @{ML Code.add_eqn}~@{text "(kind, default)"}~@{text "thm"}~@{text "thy"} adds code equation
     @{text "thm"} to executable content.

  \item @{ML Code.del_eqn}~@{text "thm"}~@{text "thy"} removes code equation
     @{text "thm"} from executable content, if present.

  \item @{ML Code_Preproc.map_pre}~@{text "f"}~@{text "thy"} changes
     the preprocessor simpset (see the sketch following this list).

  \item @{ML Code_Preproc.add_functrans}~@{text "(name, f)"}~@{text "thy"} adds
     function transformer @{text f} (named @{text name}) to executable content;
     @{text f} is a transformer of the code equations belonging
     to a certain function definition, depending on the
     current theory context.  Returning @{text NONE} indicates that no
     transformation took place; otherwise, the whole process will be iterated
     with the new code equations.

  \item @{ML Code_Preproc.del_functrans}~@{text "name"}~@{text "thy"} removes
     the function transformer named @{text name} from executable content.

  \item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds
     a datatype to executable content, with generation
     set @{text cs}.

  \item @{ML Code.get_type_of_constr_or_abstr}~@{text "thy"}~@{text "const"}
     returns the type constructor corresponding to
     constructor @{text const}; returns @{text NONE}
     if @{text const} is not a constructor.

  \end{description}
\<close>
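
text \<open>
  For illustration, here is a minimal sketch of a typical application of
  @{ML Code_Preproc.map_pre}: extending the preprocessor simpset by a
  hypothetical rewrite rule @{text foo_unfold} from within a theory text:

  @{verbatim [display] \<open>setup "Code_Preproc.map_pre
  (fn ctxt => ctxt addsimps @{thms foo_unfold})"\<close>}

  \noindent @{ML Code_Preproc.map_post} is used analogously for the
  postprocessor.
\<close>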


subsubsection \<open>Data depending on the theory's executable content\<close>

text \<open>
  Implementing code generator applications on top of the framework set
  out so far usually involves not only using those primitive
  interfaces but also storing code-dependent data and various other
  things.

  Due to incrementality of code generation, changes in the theory's
  executable content have to be propagated in a certain fashion.
  Additionally, such changes may occur not only during theory
  extension but also during theory merge, which is a little bit nasty
  from an implementation point of view.  The framework solves
  this technical challenge by providing a functorial data
  slot @{ML_functor Code_Data}; on instantiation of this functor, the
  following types and operations are required:

  \medskip
  \begin{tabular}{l}
  @{text "type T"} \\
  @{text "val empty: T"} \\
  \end{tabular}

  \begin{description}

  \item @{text T} the type of data to store.

  \item @{text empty} initial (empty) data.

  \end{description}

  \noindent An instance of @{ML_functor Code_Data} provides the
  following interface:

  \medskip
  \begin{tabular}{l}
  @{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\
  @{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"}
  \end{tabular}

  \begin{description}

  \item @{text change} update of current data (cached!) by giving a
    continuation.

  \item @{text change_yield} update with side result.

  \end{description}
\<close>
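
text \<open>
  A minimal instantiation (with a hypothetical structure name and a
  placeholder data type) might look as follows:

  @{verbatim [display] \<open>structure My_Data = Code_Data
(
  type T = string list;
  val empty = [];
);\<close>}

  \noindent Subsequently, @{text "My_Data.change"} and @{text "My_Data.change_yield"}
  give access to the stored data, which the framework keeps consistent with
  the theory's executable content.
\<close>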

end