(*  Title:      src/Doc/Codegen/Further.thy
    Author:     Florian Haftmann
*)
theory Further
imports Setup
begin

section \<open>Further issues \label{sec:further}\<close>

subsection \<open>Specialities of the @{text Scala} target language \label{sec:scala}\<close>

text \<open>
  @{text Scala} deviates from languages of the ML family in a couple
  of aspects; those which affect code generation mainly have to do with
  @{text Scala}'s type system:

  \begin{itemize}

    \item @{text Scala} prefers tupled syntax over curried syntax.

    \item @{text Scala} sacrifices Hindley-Milner type inference for a
      much richer type system with subtyping etc.  For this reason
      type arguments sometimes have to be given explicitly in square
      brackets (mimicking System F syntax).

    \item In contrast to @{text Haskell} where most specialities of
      the type system are implemented using \emph{type classes},
      @{text Scala} provides a sophisticated system of \emph{implicit
      arguments}.

  \end{itemize}

  \noindent Concerning currying, the @{text Scala} serializer counts
  arguments in code equations to determine how many arguments
  shall be tupled; remaining arguments and abstractions in terms
  rather than function definitions are always curried.

  The second aspect affects user-defined adaptations with @{command
  code_printing}.  For regular terms, the @{text Scala} serializer prints
  all type arguments explicitly.  For user-defined term adaptations
  this is only possible for adaptations which take no arguments: here
  the type arguments are just appended.  Otherwise they are ignored;
  hence user-defined adaptations for polymorphic constants have to be
  designed very carefully to avoid ambiguity.

  Isabelle's type classes are mapped onto @{text Scala} implicits; in
  cases with diamonds in the subclass hierarchy this can lead to
  ambiguities in the generated code:
\<close>

class %quote class1 =
  fixes foo :: "'a \<Rightarrow> 'a"

class %quote class2 = class1

class %quote class3 = class1

text \<open>
  \noindent Here both @{class class2} and @{class class3} inherit from @{class class1},
  forming the upper part of a diamond.
\<close>

definition %quote bar :: "'a :: {class2, class3} \<Rightarrow> 'a" where
  "bar = foo"

text \<open>
  \noindent This yields the following code:
\<close>

text %quotetypewriter \<open>
  @{code_stmts bar (Scala)}
\<close>

text \<open>
  \noindent This code is rejected by the @{text Scala} compiler: in
  the definition of @{text bar}, it is not clear from where to derive
  the implicit argument for @{text foo}.

  The solution to the problem is to close the diamond by a further
  class which inherits from both @{class class2} and @{class class3}:
\<close>

class %quote class4 = class2 + class3

text \<open>
  \noindent Then the offending code equation can be restricted to
  @{class class4}:
\<close>

lemma %quote [code]:
  "(bar :: 'a::class4 \<Rightarrow> 'a) = foo"
  by (simp only: bar_def)

text \<open>
  \noindent with the following code:
\<close>

text %quotetypewriter \<open>
  @{code_stmts bar (Scala)}
\<close>

text \<open>
  \noindent which exposes no ambiguity.

  Since the preprocessor (cf.~\secref{sec:preproc}) propagates sort
  constraints through a system of code equations, it is usually not
  very difficult to identify the set of code equations which actually
  needs more restricted sort constraints.
\<close>
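
text \<open>
  \noindent The code equations selected for @{text bar} and the
  constants it depends on, together with the sort constraints
  determined by the preprocessor, can be inspected explicitly using
  the @{command code_thms} command:
\<close>

code_thms %quote bar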

subsection \<open>Modules namespace\<close>

text \<open>
  When invoking the @{command export_code} command it is possible to
  leave out the @{keyword "module_name"} part; then code is
  distributed over different modules, where the module name space
  is roughly induced by the Isabelle theory name space.

  Sometimes this leads to the awkward situation that dependencies
  between definitions introduce cyclic dependencies between modules,
  which in the @{text Haskell} world leaves you at the mercy of the
  @{text Haskell} implementation you are using, while for @{text
  SML}/@{text OCaml} code generation is not possible at all.

  A solution is to declare module names explicitly.  Let us assume
  the three cyclically dependent modules are named \emph{A}, \emph{B}
  and \emph{C}.  Then, by stating
\<close>

code_identifier %quote
  code_module A \<rightharpoonup> (SML) ABC
| code_module B \<rightharpoonup> (SML) ABC
| code_module C \<rightharpoonup> (SML) ABC

text \<open>
  \noindent we explicitly map all those modules onto \emph{ABC},
  resulting in an ad-hoc merge of these three modules at serialisation
  time.
\<close>

subsection \<open>Locales and interpretation\<close>

text \<open>
  A technical issue comes to the surface when generating code from
  specifications stemming from locale interpretation.

  Let us assume a locale specifying a power operation on arbitrary
  types:
\<close>

locale %quote power =
  fixes power :: "'a \<Rightarrow> 'b \<Rightarrow> 'b"
  assumes power_commute: "power x \<circ> power y = power y \<circ> power x"
begin

text \<open>
  \noindent Inside that locale we can lift @{text power} to exponent
  lists by means of a specification relative to that locale:
\<close>

primrec %quote powers :: "'a list \<Rightarrow> 'b \<Rightarrow> 'b" where
  "powers [] = id"
| "powers (x # xs) = power x \<circ> powers xs"

lemma %quote powers_append:
  "powers (xs @ ys) = powers xs \<circ> powers ys"
  by (induct xs) simp_all

lemma %quote powers_power:
  "powers xs \<circ> power x = power x \<circ> powers xs"
  by (induct xs)
    (simp_all del: o_apply id_apply add: comp_assoc,
      simp del: o_apply add: o_assoc power_commute)

lemma %quote powers_rev:
  "powers (rev xs) = powers xs"
    by (induct xs) (simp_all add: powers_append powers_power)

end %quote

text \<open>
  After an interpretation of this locale (say, @{command_def
  interpretation} @{text "fun_power:"} @{term [source] "power (\<lambda>n (f
  :: 'a \<Rightarrow> 'a). f ^^ n)"}), one would expect to have a constant @{text
  "fun_power.powers :: nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a"} for which code
  can be generated.  But this is not the case: internally, the term
  @{text "fun_power.powers"} is an abbreviation for the foundational
  term @{term [source] "power.powers (\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"}
  (see @{cite "isabelle-locale"} for the details behind this).

  Fortunately, with minor effort the desired behaviour can be
  achieved.  First, we give a dedicated definition of the constant
  onto which the local @{text "powers"} is supposed to be mapped
  after interpretation:
\<close>

definition %quote funpows :: "nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a" where
  [code del]: "funpows = power.powers (\<lambda>n f. f ^^ n)"

text \<open>
  \noindent In general, the pattern is @{text "c = t"} where @{text c}
  is the name of the future constant and @{text t} the foundational
  term corresponding to the local constant after interpretation.

  The interpretation itself is enriched with an equation @{text "t = c"}:
\<close>

interpretation %quote fun_power: power "(\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)" rewrites
  "power.powers (\<lambda>n f. f ^^ n) = funpows"
  by unfold_locales
    (simp_all add: fun_eq_iff funpow_mult mult.commute funpows_def)

text \<open>
  \noindent This additional equation is trivially proved by the
  definition itself.

  After this setup procedure, code generation can continue as usual:
\<close>

text %quotetypewriter \<open>
  @{code_stmts funpows (consts) Nat.funpow funpows (Haskell)}
\<close> (*<*)

(*>*) text \<open>
  Fortunately, an even more succinct approach is available using the
  command @{command permanent_interpretation}.
  Then the pattern above collapses to
\<close> (*<*)

setup \<open>Sign.add_path "funpows"\<close>
hide_const (open) funpows

(*>*)
permanent_interpretation %quote fun_power: power "(\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"
  defining funpows = "fun_power.powers :: nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a"
  by unfold_locales
    (simp_all add: fun_eq_iff funpow_mult mult.commute) (*<*)

setup \<open>Sign.parent_path\<close>

(*>*)


subsection \<open>Parallel computation\<close>

text \<open>
  Theory @{text Parallel} in @{file "~~/src/HOL/Library"} contains
  operations to exploit parallelism inside the Isabelle/ML
  runtime engine.
\<close>

subsection \<open>Imperative data structures\<close>

text \<open>
  If you regard imperative data structures as inevitable for a
  specific application, you should consider \emph{Imperative
  Functional Programming with Isabelle/HOL}
  @{cite "bulwahn-et-al:2008:imperative"}; the framework described there
  is available in session @{text Imperative_HOL}, together with a
  short primer document.
\<close>


subsection \<open>ML system interfaces \label{sec:ml}\<close>

text \<open>
  Since the code generator framework not only aims to provide a nice
  Isar interface but also to form a base for code-generation-based
  applications, we give here a short description of the most
  fundamental ML interfaces.
\<close>

subsubsection \<open>Managing executable content\<close>

text %mlref \<open>
  \begin{mldecls}
  @{index_ML Code.read_const: "theory -> string -> string"} \\
  @{index_ML Code.add_eqn: "thm -> theory -> theory"} \\
  @{index_ML Code.del_eqn: "thm -> theory -> theory"} \\
  @{index_ML Code_Preproc.map_pre: "(Proof.context -> Proof.context) -> theory -> theory"} \\
  @{index_ML Code_Preproc.map_post: "(Proof.context -> Proof.context) -> theory -> theory"} \\
  @{index_ML Code_Preproc.add_functrans: "
    string * (Proof.context -> (thm * bool) list -> (thm * bool) list option)
      -> theory -> theory"} \\
  @{index_ML Code_Preproc.del_functrans: "string -> theory -> theory"} \\
  @{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\
  @{index_ML Code.get_type: "theory -> string
    -> ((string * sort) list * (string * ((string * sort) list * typ list)) list) * bool"} \\
  @{index_ML Code.get_type_of_constr_or_abstr: "theory -> string -> (string * bool) option"}
  \end{mldecls}

  \begin{description}

  \item @{ML Code.read_const}~@{text thy}~@{text s}
     reads a constant as a concrete term expression @{text s}.

  \item @{ML Code.add_eqn}~@{text "thm"}~@{text "thy"} adds function
     theorem @{text "thm"} to executable content.

  \item @{ML Code.del_eqn}~@{text "thm"}~@{text "thy"} removes function
     theorem @{text "thm"} from executable content, if present.

  \item @{ML Code_Preproc.map_pre}~@{text "f"}~@{text "thy"} changes
     the preprocessor simpset.

  \item @{ML Code_Preproc.add_functrans}~@{text "(name, f)"}~@{text "thy"} adds
     function transformer @{text f} (named @{text name}) to executable content;
     @{text f} is a transformer of the code equations belonging
     to a certain function definition, depending on the
     current theory context.  Returning @{text NONE} indicates that no
     transformation took place; otherwise, the whole process will be iterated
     with the new code equations.  See the sketch after this list for a
     minimal example.

  \item @{ML Code_Preproc.del_functrans}~@{text "name"}~@{text "thy"} removes
     function transformer named @{text name} from executable content.

  \item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds
     a datatype to executable content, with generation
     set @{text cs}.

  \item @{ML Code.get_type_of_constr_or_abstr}~@{text "thy"}~@{text "const"}
     returns the type constructor corresponding to
     constructor @{text const}; returns @{text NONE}
     if @{text const} is not a constructor.

  \end{description}
\<close>
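
text \<open>
  \noindent As a minimal sketch of the function transformer interface
  (the transformer name @{text "no_op"} is arbitrary), the following
  transformer is registered and immediately removed again; it always
  returns @{text NONE} and hence leaves all code equations untouched:
\<close>

setup %quote \<open>
  Code_Preproc.add_functrans ("no_op",
    fn _ => fn _ => NONE)  (* context and code equations are ignored *)
\<close>

setup %quote \<open>
  Code_Preproc.del_functrans "no_op"
\<close>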


subsubsection \<open>Data depending on the theory's executable content\<close>

text \<open>
  Implementing code generator applications on top of the framework set
  out so far usually involves not only using those primitive
  interfaces but also storing code-dependent data and various other
  things.

  Due to incrementality of code generation, changes in the theory's
  executable content have to be propagated in a certain fashion.
  Additionally, such changes may occur not only during theory
  extension but also during theory merge, which is a little bit nasty
  from an implementation point of view.  The framework addresses this
  technical challenge by providing a functorial data slot
  @{ML_functor Code_Data}; on instantiation of this functor, the
  following types and operations are required:

  \medskip
  \begin{tabular}{l}
  @{text "type T"} \\
  @{text "val empty: T"} \\
  \end{tabular}

  \begin{description}

  \item @{text T} the type of data to store.

  \item @{text empty} initial (empty) data.

  \end{description}

  \noindent An instance of @{ML_functor Code_Data} provides the
  following interface:

  \medskip
  \begin{tabular}{l}
  @{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\
  @{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"}
  \end{tabular}

  \begin{description}

  \item @{text change} update of current data (cached!) by giving a
    continuation.

  \item @{text change_yield} update with side result.

  \end{description}
\<close>
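
text \<open>
  \noindent A minimal sketch of such an instantiation follows; the
  structure name @{text "Example_Data"} and the stored type are
  hypothetical and merely illustrate the required @{text "type T"}
  and @{text "val empty"}:
\<close>

ML %quote \<open>
  (* a data slot holding a list of constant names; it is kept in sync
     with the theory's executable content by the framework, which
     invalidates and merges it as the theory changes *)
  structure Example_Data = Code_Data
  (
    type T = string list;
    val empty = [];
  );
\<close>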

end