src/Doc/Codegen/Further.thy
changeset 48985 5386df44a037
parent 48951 b9238cbcdd41
child 49739 13aa6d8268ec
equal deleted inserted replaced
48984:f51d4a302962 48985:5386df44a037
       
     1 theory Further
       
     2 imports Setup
       
     3 begin
       
     4 
       
     5 section {* Further issues \label{sec:further} *}
       
     6 
       
     7 subsection {* Specialities of the @{text Scala} target language \label{sec:scala} *}
       
     8 
       
     9 text {*
       
    10   @{text Scala} deviates from languages of the ML family in a couple
       
    11   of aspects; those which affect code generation mainly have to do with
       
    12   @{text Scala}'s type system:
       
    13 
       
    14   \begin{itemize}
       
    15 
       
    16     \item @{text Scala} prefers tupled syntax over curried syntax.
       
    17 
       
    18     \item @{text Scala} sacrifices Hindely-Milner type inference for a
       
    19       much more rich type system with subtyping etc.  For this reason
       
    20       type arguments sometimes have to be given explicitly in square
       
    21       brackets (mimicking System F syntax).
       
    22 
       
    23     \item In contrast to @{text Haskell} where most specialities of
       
    24       the type system are implemented using \emph{type classes},
       
    25       @{text Scala} provides a sophisticated system of \emph{implicit
       
    26       arguments}.
       
    27 
       
    28   \end{itemize}
       
    29 
       
    30   \noindent Concerning currying, the @{text Scala} serializer counts
       
    31   arguments in code equations to determine how many arguments
       
    32   shall be tupled; remaining arguments and abstractions in terms
       
    33   rather than function definitions are always curried.
       
    34 
       
    35   The second aspect affects user-defined adaptations with @{command
       
    36   code_const}.  For regular terms, the @{text Scala} serializer prints
       
    37   all type arguments explicitly.  For user-defined term adaptations
       
    38   this is only possible for adaptations which take no arguments: here
       
    39   the type arguments are just appended.  Otherwise they are ignored;
       
    40   hence user-defined adaptations for polymorphic constants have to be
       
    41   designed very carefully to avoid ambiguity.
       
    42 
       
    43   Isabelle's type classes are mapped onto @{text Scala} implicits; in
       
    44   cases with diamonds in the subclass hierarchy this can lead to
       
    45   ambiguities in the generated code:
       
    46 *}
       
    47 
       
    48 class %quote class1 =
       
    49   fixes foo :: "'a \<Rightarrow> 'a"
       
    50 
       
    51 class %quote class2 = class1
       
    52 
       
    53 class %quote class3 = class1
       
    54 
       
    55 text {*
       
    56   \noindent Here both @{class class2} and @{class class3} inherit from @{class class1},
       
    57   forming the upper part of a diamond.
       
    58 *}
       
    59 
       
    60 definition %quote bar :: "'a :: {class2, class3} \<Rightarrow> 'a" where
       
    61   "bar = foo"
       
    62 
       
    63 text {*
       
    64   \noindent This yields the following code:
       
    65 *}
       
    66 
       
    67 text %quotetypewriter {*
       
    68   @{code_stmts bar (Scala)}
       
    69 *}
       
    70 
       
    71 text {*
       
    72   \noindent This code is rejected by the @{text Scala} compiler: in
       
    73   the definition of @{text bar}, it is not clear from where to derive
       
    74   the implicit argument for @{text foo}.
       
    75 
       
    76   The solution to the problem is to close the diamond by a further
       
    77   class with inherits from both @{class class2} and @{class class3}:
       
    78 *}
       
    79 
       
    80 class %quote class4 = class2 + class3
       
    81 
       
    82 text {*
       
    83   \noindent Then the offending code equation can be restricted to
       
    84   @{class class4}:
       
    85 *}
       
    86 
       
    87 lemma %quote [code]:
       
    88   "(bar :: 'a::class4 \<Rightarrow> 'a) = foo"
       
    89   by (simp only: bar_def)
       
    90 
       
    91 text {*
       
    92   \noindent with the following code:
       
    93 *}
       
    94 
       
    95 text %quotetypewriter {*
       
    96   @{code_stmts bar (Scala)}
       
    97 *}
       
    98 
       
    99 text {*
       
   100   \noindent which exposes no ambiguity.
       
   101 
       
   102   Since the preprocessor (cf.~\secref{sec:preproc}) propagates sort
       
   103   constraints through a system of code equations, it is usually not
       
   104   very difficult to identify the set of code equations which actually
       
   105   needs more restricted sort constraints.
       
   106 *}
       
   107 
       
   108 subsection {* Modules namespace *}
       
   109 
       
   110 text {*
       
   111   When invoking the @{command export_code} command it is possible to
       
   112   leave out the @{keyword "module_name"} part; then code is
       
   113   distributed over different modules, where the module name space
       
   114   roughly is induced by the Isabelle theory name space.
       
   115 
       
   116   Then sometimes the awkward situation occurs that dependencies
       
   117   between definitions introduce cyclic dependencies between modules,
       
   118   which in the @{text Haskell} world leaves you to the mercy of the
       
   119   @{text Haskell} implementation you are using, while for @{text
       
   120   SML}/@{text OCaml} code generation is not possible.
       
   121 
       
   122   A solution is to declare module names explicitly.  Let use assume
       
   123   the three cyclically dependent modules are named \emph{A}, \emph{B}
       
   124   and \emph{C}.  Then, by stating
       
   125 *}
       
   126 
       
   127 code_modulename %quote SML
       
   128   A ABC
       
   129   B ABC
       
   130   C ABC
       
   131 
       
   132 text {*
       
   133   \noindent we explicitly map all those modules on \emph{ABC},
       
   134   resulting in an ad-hoc merge of this three modules at serialisation
       
   135   time.
       
   136 *}
       
   137 
       
   138 subsection {* Locales and interpretation *}
       
   139 
       
   140 text {*
       
   141   A technical issue comes to surface when generating code from
       
   142   specifications stemming from locale interpretation.
       
   143 
       
   144   Let us assume a locale specifying a power operation on arbitrary
       
   145   types:
       
   146 *}
       
   147 
       
   148 locale %quote power =
       
   149   fixes power :: "'a \<Rightarrow> 'b \<Rightarrow> 'b"
       
   150   assumes power_commute: "power x \<circ> power y = power y \<circ> power x"
       
   151 begin
       
   152 
       
   153 text {*
       
   154   \noindent Inside that locale we can lift @{text power} to exponent
       
   155   lists by means of specification relative to that locale:
       
   156 *}
       
   157 
       
   158 primrec %quote powers :: "'a list \<Rightarrow> 'b \<Rightarrow> 'b" where
       
   159   "powers [] = id"
       
   160 | "powers (x # xs) = power x \<circ> powers xs"
       
   161 
       
   162 lemma %quote powers_append:
       
   163   "powers (xs @ ys) = powers xs \<circ> powers ys"
       
   164   by (induct xs) simp_all
       
   165 
       
   166 lemma %quote powers_power:
       
   167   "powers xs \<circ> power x = power x \<circ> powers xs"
       
   168   by (induct xs)
       
   169     (simp_all del: o_apply id_apply add: o_assoc [symmetric],
       
   170       simp del: o_apply add: o_assoc power_commute)
       
   171 
       
   172 lemma %quote powers_rev:
       
   173   "powers (rev xs) = powers xs"
       
   174     by (induct xs) (simp_all add: powers_append powers_power)
       
   175 
       
   176 end %quote
       
   177 
       
   178 text {*
       
   179   After an interpretation of this locale (say, @{command_def
       
   180   interpretation} @{text "fun_power:"} @{term [source] "power (\<lambda>n (f
       
   181   :: 'a \<Rightarrow> 'a). f ^^ n)"}), one would expect to have a constant @{text
       
   182   "fun_power.powers :: nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a"} for which code
       
   183   can be generated.  But this not the case: internally, the term
       
   184   @{text "fun_power.powers"} is an abbreviation for the foundational
       
   185   term @{term [source] "power.powers (\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"}
       
   186   (see \cite{isabelle-locale} for the details behind).
       
   187 
       
   188   Fortunately, with minor effort the desired behaviour can be
       
   189   achieved.  First, a dedicated definition of the constant on which
       
   190   the local @{text "powers"} after interpretation is supposed to be
       
   191   mapped on:
       
   192 *}
       
   193 
       
   194 definition %quote funpows :: "nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a" where
       
   195   [code del]: "funpows = power.powers (\<lambda>n f. f ^^ n)"
       
   196 
       
   197 text {*
       
   198   \noindent In general, the pattern is @{text "c = t"} where @{text c}
       
   199   is the name of the future constant and @{text t} the foundational
       
   200   term corresponding to the local constant after interpretation.
       
   201 
       
   202   The interpretation itself is enriched with an equation @{text "t = c"}:
       
   203 *}
       
   204 
       
   205 interpretation %quote fun_power: power "\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n" where
       
   206   "power.powers (\<lambda>n f. f ^^ n) = funpows"
       
   207   by unfold_locales
       
   208     (simp_all add: fun_eq_iff funpow_mult mult_commute funpows_def)
       
   209 
       
   210 text {*
       
   211   \noindent This additional equation is trivially proved by the
       
   212   definition itself.
       
   213 
       
   214   After this setup procedure, code generation can continue as usual:
       
   215 *}
       
   216 
       
   217 text %quotetypewriter {*
       
   218   @{code_stmts funpows (consts) Nat.funpow funpows (Haskell)}
       
   219 *}
       
   220 
       
   221 
       
   222 subsection {* Imperative data structures *}
       
   223 
       
   224 text {*
       
   225   If you consider imperative data structures as inevitable for a
       
   226   specific application, you should consider \emph{Imperative
       
   227   Functional Programming with Isabelle/HOL}
       
   228   \cite{bulwahn-et-al:2008:imperative}; the framework described there
       
   229   is available in session @{text Imperative_HOL}, together with a
       
   230   short primer document.
       
   231 *}
       
   232 
       
   233 
       
   234 subsection {* ML system interfaces \label{sec:ml} *}
       
   235 
       
   236 text {*
       
   237   Since the code generator framework not only aims to provide a nice
       
   238   Isar interface but also to form a base for code-generation-based
       
   239   applications, here a short description of the most fundamental ML
       
   240   interfaces.
       
   241 *}
       
   242 
       
   243 subsubsection {* Managing executable content *}
       
   244 
       
   245 text %mlref {*
       
   246   \begin{mldecls}
       
   247   @{index_ML Code.read_const: "theory -> string -> string"} \\
       
   248   @{index_ML Code.add_eqn: "thm -> theory -> theory"} \\
       
   249   @{index_ML Code.del_eqn: "thm -> theory -> theory"} \\
       
   250   @{index_ML Code_Preproc.map_pre: "(simpset -> simpset) -> theory -> theory"} \\
       
   251   @{index_ML Code_Preproc.map_post: "(simpset -> simpset) -> theory -> theory"} \\
       
   252   @{index_ML Code_Preproc.add_functrans: "
       
   253     string * (theory -> (thm * bool) list -> (thm * bool) list option)
       
   254       -> theory -> theory"} \\
       
   255   @{index_ML Code_Preproc.del_functrans: "string -> theory -> theory"} \\
       
   256   @{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\
       
   257   @{index_ML Code.get_type: "theory -> string
       
   258     -> ((string * sort) list * (string * ((string * sort) list * typ list)) list) * bool"} \\
       
   259   @{index_ML Code.get_type_of_constr_or_abstr: "theory -> string -> (string * bool) option"}
       
   260   \end{mldecls}
       
   261 
       
   262   \begin{description}
       
   263 
       
   264   \item @{ML Code.read_const}~@{text thy}~@{text s}
       
   265      reads a constant as a concrete term expression @{text s}.
       
   266 
       
   267   \item @{ML Code.add_eqn}~@{text "thm"}~@{text "thy"} adds function
       
   268      theorem @{text "thm"} to executable content.
       
   269 
       
   270   \item @{ML Code.del_eqn}~@{text "thm"}~@{text "thy"} removes function
       
   271      theorem @{text "thm"} from executable content, if present.
       
   272 
       
   273   \item @{ML Code_Preproc.map_pre}~@{text "f"}~@{text "thy"} changes
       
   274      the preprocessor simpset.
       
   275 
       
   276   \item @{ML Code_Preproc.add_functrans}~@{text "(name, f)"}~@{text "thy"} adds
       
   277      function transformer @{text f} (named @{text name}) to executable content;
       
   278      @{text f} is a transformer of the code equations belonging
       
   279      to a certain function definition, depending on the
       
   280      current theory context.  Returning @{text NONE} indicates that no
       
   281      transformation took place;  otherwise, the whole process will be iterated
       
   282      with the new code equations.
       
   283 
       
   284   \item @{ML Code_Preproc.del_functrans}~@{text "name"}~@{text "thy"} removes
       
   285      function transformer named @{text name} from executable content.
       
   286 
       
   287   \item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds
       
   288      a datatype to executable content, with generation
       
   289      set @{text cs}.
       
   290 
       
   291   \item @{ML Code.get_type_of_constr_or_abstr}~@{text "thy"}~@{text "const"}
       
   292      returns type constructor corresponding to
       
   293      constructor @{text const}; returns @{text NONE}
       
   294      if @{text const} is no constructor.
       
   295 
       
   296   \end{description}
       
   297 *}
       
   298 
       
   299 
       
   300 subsubsection {* Data depending on the theory's executable content *}
       
   301 
       
   302 text {*
       
   303   Implementing code generator applications on top of the framework set
       
   304   out so far usually not only involves using those primitive
       
   305   interfaces but also storing code-dependent data and various other
       
   306   things.
       
   307 
       
   308   Due to incrementality of code generation, changes in the theory's
       
   309   executable content have to be propagated in a certain fashion.
       
   310   Additionally, such changes may occur not only during theory
       
   311   extension but also during theory merge, which is a little bit nasty
       
   312   from an implementation point of view.  The framework provides a
       
   313   solution to this technical challenge by providing a functorial data
       
   314   slot @{ML_functor Code_Data}; on instantiation of this functor, the
       
   315   following types and operations are required:
       
   316 
       
   317   \medskip
       
   318   \begin{tabular}{l}
       
   319   @{text "type T"} \\
       
   320   @{text "val empty: T"} \\
       
   321   \end{tabular}
       
   322 
       
   323   \begin{description}
       
   324 
       
   325   \item @{text T} the type of data to store.
       
   326 
       
   327   \item @{text empty} initial (empty) data.
       
   328 
       
   329   \end{description}
       
   330 
       
   331   \noindent An instance of @{ML_functor Code_Data} provides the
       
   332   following interface:
       
   333 
       
   334   \medskip
       
   335   \begin{tabular}{l}
       
   336   @{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\
       
   337   @{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"}
       
   338   \end{tabular}
       
   339 
       
   340   \begin{description}
       
   341 
       
   342   \item @{text change} update of current data (cached!) by giving a
       
   343     continuation.
       
   344 
       
   345   \item @{text change_yield} update with side result.
       
   346 
       
   347   \end{description}
       
   348 *}
       
   349 
       
   350 end
       
   351