src/Doc/Codegen/Further.thy
 author haftmann Sat Nov 14 08:45:52 2015 +0100 (2015-11-14) changeset 61670 301e0b4ecd45 parent 61566 c3d6e570ccef child 61671 20d4cd2ceab2 permissions -rw-r--r--
coalesce permanent_interpretation.ML with interpretation.ML
     1 theory Further

     2 imports Setup

     3 begin

     4

     5 section \<open>Further issues \label{sec:further}\<close>

     6

     7 subsection \<open>Specialities of the @{text Scala} target language \label{sec:scala}\<close>

     8

     9 text \<open>

    10   @{text Scala} deviates from languages of the ML family in a couple

    11   of aspects; those which affect code generation mainly have to do with

    12   @{text Scala}'s type system:

    13

    14   \begin{itemize}

    15

    16     \item @{text Scala} prefers tupled syntax over curried syntax.

    17

    18     \item @{text Scala} sacrifices Hindely-Milner type inference for a

    19       much more rich type system with subtyping etc.  For this reason

    20       type arguments sometimes have to be given explicitly in square

    21       brackets (mimicking System F syntax).

    22

    23     \item In contrast to @{text Haskell} where most specialities of

    24       the type system are implemented using \emph{type classes},

    25       @{text Scala} provides a sophisticated system of \emph{implicit

    26       arguments}.

    27

    28   \end{itemize}

    29

    30   \noindent Concerning currying, the @{text Scala} serializer counts

    31   arguments in code equations to determine how many arguments

    32   shall be tupled; remaining arguments and abstractions in terms

    33   rather than function definitions are always curried.

    34

    35   The second aspect affects user-defined adaptations with @{command

    36   code_printing}.  For regular terms, the @{text Scala} serializer prints

    37   all type arguments explicitly.  For user-defined term adaptations

    38   this is only possible for adaptations which take no arguments: here

    39   the type arguments are just appended.  Otherwise they are ignored;

    40   hence user-defined adaptations for polymorphic constants have to be

    41   designed very carefully to avoid ambiguity.

    42

    43   Isabelle's type classes are mapped onto @{text Scala} implicits; in

    44   cases with diamonds in the subclass hierarchy this can lead to

    45   ambiguities in the generated code:

    46 \<close>

    47

    48 class %quote class1 =

    49   fixes foo :: "'a \<Rightarrow> 'a"

    50

    51 class %quote class2 = class1

    52

    53 class %quote class3 = class1

    54

    55 text \<open>

    56   \noindent Here both @{class class2} and @{class class3} inherit from @{class class1},

    57   forming the upper part of a diamond.

    58 \<close>

    59

    60 definition %quote bar :: "'a :: {class2, class3} \<Rightarrow> 'a" where

    61   "bar = foo"

    62

    63 text \<open>

    64   \noindent This yields the following code:

    65 \<close>

    66

    67 text %quotetypewriter \<open>

    68   @{code_stmts bar (Scala)}

    69 \<close>

    70

    71 text \<open>

    72   \noindent This code is rejected by the @{text Scala} compiler: in

    73   the definition of @{text bar}, it is not clear from where to derive

    74   the implicit argument for @{text foo}.

    75

    76   The solution to the problem is to close the diamond by a further

    77   class with inherits from both @{class class2} and @{class class3}:

    78 \<close>

    79

    80 class %quote class4 = class2 + class3

    81

    82 text \<open>

    83   \noindent Then the offending code equation can be restricted to

    84   @{class class4}:

    85 \<close>

    86

    87 lemma %quote [code]:

    88   "(bar :: 'a::class4 \<Rightarrow> 'a) = foo"

    89   by (simp only: bar_def)

    90

    91 text \<open>

    92   \noindent with the following code:

    93 \<close>

    94

    95 text %quotetypewriter \<open>

    96   @{code_stmts bar (Scala)}

    97 \<close>

    98

    99 text \<open>

   100   \noindent which exposes no ambiguity.

   101

   102   Since the preprocessor (cf.~\secref{sec:preproc}) propagates sort

   103   constraints through a system of code equations, it is usually not

   104   very difficult to identify the set of code equations which actually

   105   needs more restricted sort constraints.

   106 \<close>

   107

   108 subsection \<open>Modules namespace\<close>

   109

   110 text \<open>

   111   When invoking the @{command export_code} command it is possible to

   112   leave out the @{keyword "module_name"} part; then code is

   113   distributed over different modules, where the module name space

   114   roughly is induced by the Isabelle theory name space.

   115

   116   Then sometimes the awkward situation occurs that dependencies

   117   between definitions introduce cyclic dependencies between modules,

   118   which in the @{text Haskell} world leaves you to the mercy of the

   119   @{text Haskell} implementation you are using, while for @{text

   120   SML}/@{text OCaml} code generation is not possible.

   121

   122   A solution is to declare module names explicitly.  Let use assume

   123   the three cyclically dependent modules are named \emph{A}, \emph{B}

   124   and \emph{C}.  Then, by stating

   125 \<close>

   126

   127 code_identifier %quote

   128   code_module A \<rightharpoonup> (SML) ABC

   129 | code_module B \<rightharpoonup> (SML) ABC

   130 | code_module C \<rightharpoonup> (SML) ABC

   131

   132 text \<open>

   133   \noindent we explicitly map all those modules on \emph{ABC},

   134   resulting in an ad-hoc merge of this three modules at serialisation

   135   time.

   136 \<close>

   137

   138 subsection \<open>Locales and interpretation\<close>

   139

   140 text \<open>

   141   A technical issue comes to surface when generating code from

   142   specifications stemming from locale interpretation.

   143

   144   Let us assume a locale specifying a power operation on arbitrary

   145   types:

   146 \<close>

   147

   148 locale %quote power =

   149   fixes power :: "'a \<Rightarrow> 'b \<Rightarrow> 'b"

   150   assumes power_commute: "power x \<circ> power y = power y \<circ> power x"

   151 begin

   152

   153 text \<open>

   154   \noindent Inside that locale we can lift @{text power} to exponent

   155   lists by means of specification relative to that locale:

   156 \<close>

   157

   158 primrec %quote powers :: "'a list \<Rightarrow> 'b \<Rightarrow> 'b" where

   159   "powers [] = id"

   160 | "powers (x # xs) = power x \<circ> powers xs"

   161

   162 lemma %quote powers_append:

   163   "powers (xs @ ys) = powers xs \<circ> powers ys"

   164   by (induct xs) simp_all

   165

   166 lemma %quote powers_power:

   167   "powers xs \<circ> power x = power x \<circ> powers xs"

   168   by (induct xs)

   169     (simp_all del: o_apply id_apply add: comp_assoc,

   170       simp del: o_apply add: o_assoc power_commute)

   171

   172 lemma %quote powers_rev:

   173   "powers (rev xs) = powers xs"

   174     by (induct xs) (simp_all add: powers_append powers_power)

   175

   176 end %quote

   177

   178 text \<open>

   179   After an interpretation of this locale (say, @{command_def

   180   interpretation} @{text "fun_power:"} @{term [source] "power (\<lambda>n (f

   181   :: 'a \<Rightarrow> 'a). f ^^ n)"}), one would expect to have a constant @{text

   182   "fun_power.powers :: nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a"} for which code

   183   can be generated.  But this not the case: internally, the term

   184   @{text "fun_power.powers"} is an abbreviation for the foundational

   185   term @{term [source] "power.powers (\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"}

   186   (see @{cite "isabelle-locale"} for the details behind).

   187

   188   Fortunately, with minor effort the desired behaviour can be

   189   achieved.  First, a dedicated definition of the constant on which

   190   the local @{text "powers"} after interpretation is supposed to be

   191   mapped on:

   192 \<close>

   193

   194 definition %quote funpows :: "nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a" where

   195   [code del]: "funpows = power.powers (\<lambda>n f. f ^^ n)"

   196

   197 text \<open>

   198   \noindent In general, the pattern is @{text "c = t"} where @{text c}

   199   is the name of the future constant and @{text t} the foundational

   200   term corresponding to the local constant after interpretation.

   201

   202   The interpretation itself is enriched with an equation @{text "t = c"}:

   203 \<close>

   204

   205 interpretation %quote fun_power: power "(\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)" rewrites

   206   "power.powers (\<lambda>n f. f ^^ n) = funpows"

   207   by unfold_locales

   208     (simp_all add: fun_eq_iff funpow_mult mult.commute funpows_def)

   209

   210 text \<open>

   211   \noindent This additional equation is trivially proved by the

   212   definition itself.

   213

   214   After this setup procedure, code generation can continue as usual:

   215 \<close>

   216

   217 text %quotetypewriter \<open>

   218   @{code_stmts funpows (consts) Nat.funpow funpows (Haskell)}

   219 \<close> (*<*)

   220

   221 (*>*) text \<open>

   222   Fortunately, an even more succint approach is available using command

   223   @{command permanent_interpretation}.

   224   Then the pattern above collapses to

   225 \<close> (*<*)

   226

   227 setup \<open>Sign.add_path "funpows"\<close>

   228 hide_const (open) funpows

   229

   230 (*>*)

   231 permanent_interpretation %quote fun_power: power "(\<lambda>n (f :: 'a \<Rightarrow> 'a). f ^^ n)"

   232   defining funpows = "fun_power.powers :: nat list \<Rightarrow> ('a \<Rightarrow> 'a) \<Rightarrow> 'a \<Rightarrow> 'a"

   233   by unfold_locales

   234     (simp_all add: fun_eq_iff funpow_mult mult.commute) (*<*)

   235

   236 setup \<open>Sign.parent_path\<close>

   237

   238 (*>*)

   239

   240

   241 subsection \<open>Parallel computation\<close>

   242

   243 text \<open>

   244   Theory @{text Parallel} in @{file "~~/src/HOL/Library"} contains

   245   operations to exploit parallelism inside the Isabelle/ML

   246   runtime engine.

   247 \<close>

   248

   249 subsection \<open>Imperative data structures\<close>

   250

   251 text \<open>

   252   If you consider imperative data structures as inevitable for a

   253   specific application, you should consider \emph{Imperative

   254   Functional Programming with Isabelle/HOL}

   255   @{cite "bulwahn-et-al:2008:imperative"}; the framework described there

   256   is available in session @{text Imperative_HOL}, together with a

   257   short primer document.

   258 \<close>

   259

   260

   261 subsection \<open>ML system interfaces \label{sec:ml}\<close>

   262

   263 text \<open>

   264   Since the code generator framework not only aims to provide a nice

   265   Isar interface but also to form a base for code-generation-based

   266   applications, here a short description of the most fundamental ML

   267   interfaces.

   268 \<close>

   269

   270 subsubsection \<open>Managing executable content\<close>

   271

   272 text %mlref \<open>

   273   \begin{mldecls}

   274   @{index_ML Code.read_const: "theory -> string -> string"} \\

   275   @{index_ML Code.add_eqn: "thm -> theory -> theory"} \\

   276   @{index_ML Code.del_eqn: "thm -> theory -> theory"} \\

   277   @{index_ML Code_Preproc.map_pre: "(Proof.context -> Proof.context) -> theory -> theory"} \\

   278   @{index_ML Code_Preproc.map_post: "(Proof.context -> Proof.context) -> theory -> theory"} \\

   279   @{index_ML Code_Preproc.add_functrans: "

   280     string * (Proof.context -> (thm * bool) list -> (thm * bool) list option)

   281       -> theory -> theory"} \\

   282   @{index_ML Code_Preproc.del_functrans: "string -> theory -> theory"} \\

   283   @{index_ML Code.add_datatype: "(string * typ) list -> theory -> theory"} \\

   284   @{index_ML Code.get_type: "theory -> string

   285     -> ((string * sort) list * (string * ((string * sort) list * typ list)) list) * bool"} \\

   286   @{index_ML Code.get_type_of_constr_or_abstr: "theory -> string -> (string * bool) option"}

   287   \end{mldecls}

   288

   289   \begin{description}

   290

   291   \item @{ML Code.read_const}~@{text thy}~@{text s}

   292      reads a constant as a concrete term expression @{text s}.

   293

   294   \item @{ML Code.add_eqn}~@{text "thm"}~@{text "thy"} adds function

   295      theorem @{text "thm"} to executable content.

   296

   297   \item @{ML Code.del_eqn}~@{text "thm"}~@{text "thy"} removes function

   298      theorem @{text "thm"} from executable content, if present.

   299

   300   \item @{ML Code_Preproc.map_pre}~@{text "f"}~@{text "thy"} changes

   301      the preprocessor simpset.

   302

   303   \item @{ML Code_Preproc.add_functrans}~@{text "(name, f)"}~@{text "thy"} adds

   304      function transformer @{text f} (named @{text name}) to executable content;

   305      @{text f} is a transformer of the code equations belonging

   306      to a certain function definition, depending on the

   307      current theory context.  Returning @{text NONE} indicates that no

   308      transformation took place;  otherwise, the whole process will be iterated

   309      with the new code equations.

   310

   311   \item @{ML Code_Preproc.del_functrans}~@{text "name"}~@{text "thy"} removes

   312      function transformer named @{text name} from executable content.

   313

   314   \item @{ML Code.add_datatype}~@{text cs}~@{text thy} adds

   315      a datatype to executable content, with generation

   316      set @{text cs}.

   317

   318   \item @{ML Code.get_type_of_constr_or_abstr}~@{text "thy"}~@{text "const"}

   319      returns type constructor corresponding to

   320      constructor @{text const}; returns @{text NONE}

   321      if @{text const} is no constructor.

   322

   323   \end{description}

   324 \<close>

   325

   326

   327 subsubsection \<open>Data depending on the theory's executable content\<close>

   328

   329 text \<open>

   330   Implementing code generator applications on top of the framework set

   331   out so far usually not only involves using those primitive

   332   interfaces but also storing code-dependent data and various other

   333   things.

   334

   335   Due to incrementality of code generation, changes in the theory's

   336   executable content have to be propagated in a certain fashion.

   337   Additionally, such changes may occur not only during theory

   338   extension but also during theory merge, which is a little bit nasty

   339   from an implementation point of view.  The framework provides a

   340   solution to this technical challenge by providing a functorial data

   341   slot @{ML_functor Code_Data}; on instantiation of this functor, the

   342   following types and operations are required:

   343

   344   \medskip

   345   \begin{tabular}{l}

   346   @{text "type T"} \\

   347   @{text "val empty: T"} \\

   348   \end{tabular}

   349

   350   \begin{description}

   351

   352   \item @{text T} the type of data to store.

   353

   354   \item @{text empty} initial (empty) data.

   355

   356   \end{description}

   357

   358   \noindent An instance of @{ML_functor Code_Data} provides the

   359   following interface:

   360

   361   \medskip

   362   \begin{tabular}{l}

   363   @{text "change: theory \<rightarrow> (T \<rightarrow> T) \<rightarrow> T"} \\

   364   @{text "change_yield: theory \<rightarrow> (T \<rightarrow> 'a * T) \<rightarrow> 'a * T"}

   365   \end{tabular}

   366

   367   \begin{description}

   368

   369   \item @{text change} update of current data (cached!) by giving a

   370     continuation.

   371

   372   \item @{text change_yield} update with side result.

   373

   374   \end{description}

   375 \<close>

   376

   377 end

   378