haftmann@31050: theory Adaptation
haftmann@28213: imports Setup
haftmann@28213: begin
haftmann@28213: 
haftmann@59377: setup %invisible \<open>Code_Target.add_derived_target ("\<SML>", [("SML", I)])
haftmann@59377:   #> Code_Target.add_derived_target ("\<SMLdummy>", [("Haskell", I)])\<close>
haftmann@28561: 
haftmann@59377: section \<open>Adaptation to target languages \label{sec:adaptation}\<close>
haftmann@28419: 
haftmann@59377: subsection \<open>Adapting code generation\<close>
haftmann@28561: 
haftmann@59377: text \<open>
haftmann@28561:   The aspects of code generation introduced so far have two aspects
haftmann@28561:   in common:
haftmann@28561: 
haftmann@28561:   \begin{itemize}
haftmann@38450: 
haftmann@38450:     \item They act uniformly, without reference to a specific target
haftmann@38450:        language.
haftmann@38450: 
haftmann@28561:     \item They are \emph{safe} in the sense that as long as you trust
haftmann@28561:        the code generator meta theory and implementation, you cannot
haftmann@38450:        produce programs that yield results which are not derivable in
haftmann@38450:        the logic.
haftmann@38450: 
haftmann@28561:   \end{itemize}
haftmann@28561: 
haftmann@38450:   \noindent In this section we will introduce means to \emph{adapt}
haftmann@38450:   the serialiser to a specific target language, i.e.~to print program
haftmann@38450:   fragments in a way which accommodates \qt{already existing}
haftmann@38450:   ingredients of a target language environment, for three reasons:
haftmann@28561: 
haftmann@28561:   \begin{itemize}
haftmann@28593:     \item improving readability and aesthetics of generated code
haftmann@28561:     \item gaining efficiency
haftmann@28561:     \item interface with language parts which have no direct counterpart
haftmann@28561:       in @{text "HOL"} (say, imperative data structures)
haftmann@28561:   \end{itemize}
haftmann@28561: 
haftmann@28561:   \noindent Generally, you should avoid using those features yourself
haftmann@28561:   \emph{at any cost}:
haftmann@28561: 
haftmann@28561:   \begin{itemize}
haftmann@38450: 
haftmann@38450:     \item The safe configuration methods act uniformly on every target
haftmann@38450:       language, whereas for adaptation you have to treat each target
haftmann@38450:       language separately.
haftmann@38450: 
haftmann@38450:     \item Application is extremely tedious since there is no
haftmann@38450:       abstraction which would allow for a static check, making it easy
haftmann@38450:       to produce garbage.
haftmann@38450: 
paulson@34155:     \item Subtle errors can be introduced unconsciously.
haftmann@38450: 
haftmann@28561:   \end{itemize}
haftmann@28561: 
haftmann@38450:   \noindent However, even if you ought refrain from setting up
haftmann@51162:   adaptation yourself, already @{text "HOL"} comes with some
haftmann@38450:   reasonable default adaptations (say, using target language list
haftmann@38450:   syntax).  There also some common adaptation cases which you can
haftmann@38450:   setup by importing particular library theories.  In order to
haftmann@38450:   understand these, we provide some clues here; these however are not
haftmann@38450:   supposed to replace a careful study of the sources.
haftmann@59377: \<close>
haftmann@28561: 
haftmann@38450: 
haftmann@59377: subsection \<open>The adaptation principle\<close>
haftmann@28561: 
haftmann@59377: text \<open>
haftmann@38450:   Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is
haftmann@38450:   conceptually supposed to be:
haftmann@28601: 
wenzelm@60768:   \begin{figure}[h]
wenzelm@52742:     \begin{tikzpicture}[scale = 0.5]
wenzelm@52742:       \tikzstyle water=[color = blue, thick]
wenzelm@52742:       \tikzstyle ice=[color = black, very thick, cap = round, join = round, fill = white]
wenzelm@52742:       \tikzstyle process=[color = green, semithick, ->]
wenzelm@52742:       \tikzstyle adaptation=[color = red, semithick, ->]
wenzelm@52742:       \tikzstyle target=[color = black]
wenzelm@52742:       \foreach \x in {0, ..., 24}
wenzelm@52742:         \draw[style=water] (\x, 0.25) sin + (0.25, 0.25) cos + (0.25, -0.25) sin
wenzelm@52742:           + (0.25, -0.25) cos + (0.25, 0.25);
wenzelm@52742:       \draw[style=ice] (1, 0) --
wenzelm@52742:         (3, 6) node[above, fill=white] {logic} -- (5, 0) -- cycle;
wenzelm@52742:       \draw[style=ice] (9, 0) --
wenzelm@52742:         (11, 6) node[above, fill=white] {intermediate language} -- (13, 0) -- cycle;
wenzelm@52742:       \draw[style=ice] (15, -6) --
wenzelm@52742:         (19, 6) node[above, fill=white] {target language} -- (23, -6) -- cycle;
wenzelm@52742:       \draw[style=process]
wenzelm@52742:         (3.5, 3) .. controls (7, 5) .. node[fill=white] {translation} (10.5, 3);
wenzelm@52742:       \draw[style=process]
wenzelm@52742:         (11.5, 3) .. controls (15, 5) .. node[fill=white] (serialisation) {serialisation} (18.5, 3);
wenzelm@52742:       \node (adaptation) at (11, -2) [style=adaptation] {adaptation};
wenzelm@52742:       \node at (19, 3) [rotate=90] {generated};
wenzelm@52742:       \node at (19.5, -5) {language};
wenzelm@52742:       \node at (19.5, -3) {library};
wenzelm@52742:       \node (includes) at (19.5, -1) {includes};
wenzelm@52742:       \node (reserved) at (16.5, -3) [rotate=72] {reserved}; % proper 71.57
wenzelm@52742:       \draw[style=process]
wenzelm@52742:         (includes) -- (serialisation);
wenzelm@52742:       \draw[style=process]
wenzelm@52742:         (reserved) -- (serialisation);
wenzelm@52742:       \draw[style=adaptation]
wenzelm@52742:         (adaptation) -- (serialisation);
wenzelm@52742:       \draw[style=adaptation]
wenzelm@52742:         (adaptation) -- (includes);
wenzelm@52742:       \draw[style=adaptation]
wenzelm@52742:         (adaptation) -- (reserved);
wenzelm@52742:     \end{tikzpicture}
haftmann@31050:     \caption{The adaptation principle}
haftmann@31050:     \label{fig:adaptation}
haftmann@28601:   \end{figure}
haftmann@28601: 
haftmann@28601:   \noindent In the tame view, code generation acts as broker between
haftmann@38450:   @{text logic}, @{text "intermediate language"} and @{text "target
haftmann@38450:   language"} by means of @{text translation} and @{text
haftmann@38450:   serialisation}; for the latter, the serialiser has to observe the
haftmann@38450:   structure of the @{text language} itself plus some @{text reserved}
haftmann@38450:   keywords which have to be avoided for generated code.  However, if
haftmann@38450:   you consider @{text adaptation} mechanisms, the code generated by
haftmann@38450:   the serializer is just the tip of the iceberg:
haftmann@28601: 
haftmann@28601:   \begin{itemize}
haftmann@38450: 
haftmann@28635:     \item @{text serialisation} can be \emph{parametrised} such that
haftmann@28635:       logical entities are mapped to target-specific ones
haftmann@38450:       (e.g. target-specific list syntax, see also
haftmann@38450:       \secref{sec:adaptation_mechanisms})
haftmann@38450: 
haftmann@28635:     \item Such parametrisations can involve references to a
haftmann@38450:       target-specific standard @{text library} (e.g. using the @{text
haftmann@38450:       Haskell} @{verbatim Maybe} type instead of the @{text HOL}
haftmann@38450:       @{type "option"} type); if such are used, the corresponding
haftmann@38450:       identifiers (in our example, @{verbatim Maybe}, @{verbatim
haftmann@38450:       Nothing} and @{verbatim Just}) also have to be considered @{text
haftmann@38450:       reserved}.
haftmann@38450: 
haftmann@28635:     \item Even more, the user can enrich the library of the
haftmann@38450:       target-language by providing code snippets (\qt{@{text
haftmann@38450:       "includes"}}) which are prepended to any generated code (see
haftmann@38450:       \secref{sec:include}); this typically also involves further
haftmann@38450:       @{text reserved} identifiers.
haftmann@38450: 
haftmann@28601:   \end{itemize}
haftmann@28635: 
haftmann@38450:   \noindent As figure \ref{fig:adaptation} illustrates, all these
haftmann@38450:   adaptation mechanisms have to act consistently; it is at the
haftmann@38450:   discretion of the user to take care for this.
haftmann@59377: \<close>
haftmann@28561: 
haftmann@59482: subsection \<open>Common adaptation applications\<close>
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@28428:   The @{theory HOL} @{theory Main} theory already provides a code
haftmann@38450:   generator setup which should be suitable for most applications.
haftmann@38450:   Common extensions and modifications are available by certain
wenzelm@63680:   theories in \<^dir>\<open>~~/src/HOL/Library\<close>; beside being useful in
haftmann@38450:   applications, they may serve as a tutorial for customising the code
haftmann@38450:   generator setup (see below \secref{sec:adaptation_mechanisms}).
haftmann@28419: 
haftmann@28419:   \begin{description}
haftmann@28419: 
haftmann@51143:     \item[@{theory "Code_Numeral"}] provides additional numeric
haftmann@51143:        types @{typ integer} and @{typ natural} isomorphic to types
haftmann@51143:        @{typ int} and @{typ nat} respectively.  Type @{typ integer}
haftmann@51143:        is mapped to target-language built-in integers; @{typ natural}
haftmann@51143:        is implemented as abstract type over @{typ integer}.
haftmann@51143:        Useful for code setups which involve e.g.~indexing
haftmann@51143:        of target-language arrays.  Part of @{text "HOL-Main"}.
haftmann@51143: 
haftmann@51143:     \item[@{text "Code_Target_Int"}] implements type @{typ int}
haftmann@51143:        by @{typ integer} and thus by target-language built-in integers.
haftmann@38450: 
haftmann@51171:     \item[@{text "Code_Binary_Nat"}] implements type
haftmann@51143:        @{typ nat} using a binary rather than a linear representation,
haftmann@51143:        which yields a considerable speedup for computations.
wenzelm@61076:        Pattern matching with @{term "0::nat"} / @{const "Suc"} is eliminated
haftmann@51171:        by a preprocessor.\label{abstract_nat}
haftmann@51143: 
haftmann@51171:     \item[@{text "Code_Target_Nat"}] implements type @{typ nat}
haftmann@51171:        by @{typ integer} and thus by target-language built-in integers.
wenzelm@61076:        Pattern matching with @{term "0::nat"} / @{const "Suc"} is eliminated
haftmann@51171:        by a preprocessor.
haftmann@51143: 
haftmann@51162:     \item[@{text "Code_Target_Numeral"}] is a convenience theory
haftmann@51143:        containing both @{text "Code_Target_Nat"} and
haftmann@51143:        @{text "Code_Target_Int"}.
haftmann@38450: 
haftmann@38450:     \item[@{theory "String"}] provides an additional datatype @{typ
haftmann@38450:        String.literal} which is isomorphic to strings; @{typ
haftmann@38450:        String.literal}s are mapped to target-language strings.  Useful
haftmann@38450:        for code setups which involve e.g.~printing (error) messages.
haftmann@46519:        Part of @{text "HOL-Main"}.
haftmann@28419: 
haftmann@59482:     \item[@{text "Code_Char"}] represents @{text HOL} characters by
haftmann@59482:        character literals in target languages.  \emph{Warning:} This
haftmann@59482:        modifies adaptation in a non-conservative manner and thus
haftmann@59482:        should always be imported \emph{last} in a theory header.
haftmann@59482: 
haftmann@51162:     \item[@{theory "IArray"}] provides a type @{typ "'a iarray"}
haftmann@51162:        isomorphic to lists but implemented by (effectively immutable)
haftmann@51162:        arrays \emph{in SML only}.
haftmann@28419: 
haftmann@51162:   \end{description}
haftmann@59377: \<close>
haftmann@28419: 
haftmann@28419: 
haftmann@59377: subsection \<open>Parametrising serialisation \label{sec:adaptation_mechanisms}\<close>
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@38450:   Consider the following function and its corresponding SML code:
haftmann@59377: \<close>
haftmann@28419: 
haftmann@28564: primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where
haftmann@28419:   "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"
haftmann@28447: (*<*)
haftmann@52378: code_printing %invisible
haftmann@52378:   type_constructor bool \<rightharpoonup> (SML)
haftmann@52378: | constant True \<rightharpoonup> (SML)
haftmann@52378: | constant False \<rightharpoonup> (SML)
haftmann@52378: | constant HOL.conj \<rightharpoonup> (SML)
haftmann@52378: | constant Not \<rightharpoonup> (SML)
haftmann@28447: (*>*)
haftmann@59377: text %quotetypewriter \<open>
haftmann@39683:   @{code_stmts in_interval (SML)}
haftmann@59377: \<close>
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@38450:   \noindent Though this is correct code, it is a little bit
haftmann@38450:   unsatisfactory: boolean values and operators are materialised as
haftmann@38450:   distinguished entities with have nothing to do with the SML-built-in
haftmann@38450:   notion of \qt{bool}.  This results in less readable code;
haftmann@38450:   additionally, eager evaluation may cause programs to loop or break
haftmann@38450:   which would perfectly terminate when the existing SML @{verbatim
haftmann@38450:   "bool"} would be used.  To map the HOL @{typ bool} on SML @{verbatim
haftmann@38450:   "bool"}, we may use \qn{custom serialisations}:
haftmann@59377: \<close>
haftmann@28419: 
haftmann@52378: code_printing %quotett
haftmann@52378:   type_constructor bool \<rightharpoonup> (SML) "bool"
haftmann@52378: | constant True \<rightharpoonup> (SML) "true"
haftmann@52378: | constant False \<rightharpoonup> (SML) "false"
haftmann@52378: | constant HOL.conj \<rightharpoonup> (SML) "_ andalso _"
haftmann@28213: 
haftmann@59377: text \<open>
haftmann@52378:   \noindent The @{command_def code_printing} command takes a series
haftmann@52378:   of symbols (contants, type constructor, \ldots)
haftmann@52378:   together with target-specific custom serialisations.  Each
haftmann@38450:   custom serialisation starts with a target language identifier
haftmann@38450:   followed by an expression, which during code serialisation is
haftmann@52378:   inserted whenever the type constructor would occur.  Each
haftmann@38450:   ``@{verbatim "_"}'' in a serialisation expression is treated as a
haftmann@52378:   placeholder for the constant's or the type constructor's arguments.
haftmann@59377: \<close>
haftmann@28419: 
haftmann@59377: text %quotetypewriter \<open>
haftmann@39683:   @{code_stmts in_interval (SML)}
haftmann@59377: \<close>
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@38450:   \noindent This still is not perfect: the parentheses around the
haftmann@38450:   \qt{andalso} expression are superfluous.  Though the serialiser by
haftmann@38450:   no means attempts to imitate the rich Isabelle syntax framework, it
haftmann@38450:   provides some common idioms, notably associative infixes with
haftmann@38450:   precedences which may be used here:
haftmann@59377: \<close>
haftmann@28419: 
haftmann@52378: code_printing %quotett
haftmann@52378:   constant HOL.conj \<rightharpoonup> (SML) infixl 1 "andalso"
haftmann@28419: 
haftmann@59377: text %quotetypewriter \<open>
haftmann@39683:   @{code_stmts in_interval (SML)}
haftmann@59377: \<close>
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@38450:   \noindent The attentive reader may ask how we assert that no
haftmann@38450:   generated code will accidentally overwrite.  For this reason the
haftmann@38450:   serialiser has an internal table of identifiers which have to be
haftmann@38450:   avoided to be used for new declarations.  Initially, this table
haftmann@38450:   typically contains the keywords of the target language.  It can be
haftmann@38450:   extended manually, thus avoiding accidental overwrites, using the
haftmann@38505:   @{command_def "code_reserved"} command:
haftmann@59377: \<close>
haftmann@28561: 
haftmann@40351: code_reserved %quote "\<SMLdummy>" bool true false andalso
haftmann@28561: 
haftmann@59377: text \<open>
haftmann@28447:   \noindent Next, we try to map HOL pairs to SML pairs, using the
haftmann@28419:   infix ``@{verbatim "*"}'' type constructor and parentheses:
haftmann@59377: \<close>
haftmann@28447: (*<*)
haftmann@52378: code_printing %invisible
haftmann@52378:   type_constructor prod \<rightharpoonup> (SML)
haftmann@52378: | constant Pair \<rightharpoonup> (SML)
haftmann@28447: (*>*)
haftmann@52378: code_printing %quotett
haftmann@52378:   type_constructor prod \<rightharpoonup> (SML) infix 2 "*"
haftmann@52378: | constant Pair \<rightharpoonup> (SML) "!((_),/ (_))"
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@28593:   \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
haftmann@38450:   never to put parentheses around the whole expression (they are
haftmann@38450:   already present), while the parentheses around argument place
haftmann@38450:   holders tell not to put parentheses around the arguments.  The slash
haftmann@38450:   ``@{verbatim "/"}'' (followed by arbitrary white space) inserts a
haftmann@38450:   space which may be used as a break if necessary during pretty
haftmann@38450:   printing.
haftmann@28419: 
haftmann@38450:   These examples give a glimpse what mechanisms custom serialisations
haftmann@38450:   provide; however their usage requires careful thinking in order not
haftmann@38450:   to introduce inconsistencies -- or, in other words: custom
haftmann@38450:   serialisations are completely axiomatic.
haftmann@28419: 
haftmann@39643:   A further noteworthy detail is that any special character in a
haftmann@38450:   custom serialisation may be quoted using ``@{verbatim "'"}''; thus,
haftmann@38450:   in ``@{verbatim "fn '_ => _"}'' the first ``@{verbatim "_"}'' is a
haftmann@38450:   proper underscore while the second ``@{verbatim "_"}'' is a
haftmann@38450:   placeholder.
haftmann@59377: \<close>
haftmann@28419: 
haftmann@28419: 
haftmann@59377: subsection \<open>@{text Haskell} serialisation\<close>
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@38450:   For convenience, the default @{text HOL} setup for @{text Haskell}
haftmann@39063:   maps the @{class equal} class to its counterpart in @{text Haskell},
haftmann@52378:   giving custom serialisations for the class @{class equal}
haftmann@52378:   and its operation @{const [source] HOL.equal}.
haftmann@59377: \<close>
haftmann@28419: 
haftmann@52378: code_printing %quotett
haftmann@52378:   type_class equal \<rightharpoonup> (Haskell) "Eq"
haftmann@52378: | constant HOL.equal \<rightharpoonup> (Haskell) infixl 4 "=="
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@38450:   \noindent A problem now occurs whenever a type which is an instance
haftmann@39063:   of @{class equal} in @{text HOL} is mapped on a @{text
haftmann@38450:   Haskell}-built-in type which is also an instance of @{text Haskell}
haftmann@38450:   @{text Eq}:
haftmann@59377: \<close>
haftmann@28419: 
haftmann@28564: typedecl %quote bar
haftmann@28419: 
haftmann@39063: instantiation %quote bar :: equal
haftmann@28419: begin
haftmann@28419: 
wenzelm@61076: definition %quote "HOL.equal (x::bar) y \<longleftrightarrow> x = y"
haftmann@28419: 
wenzelm@61169: instance %quote by standard (simp add: equal_bar_def)
haftmann@28213: 
haftmann@30880: end %quote (*<*)
haftmann@30880: 
haftmann@52378: (*>*) code_printing %quotett
haftmann@52378:   type_constructor bar \<rightharpoonup> (Haskell) "Integer"
haftmann@28419: 
haftmann@59377: text \<open>
haftmann@38450:   \noindent The code generator would produce an additional instance,
haftmann@38450:   which of course is rejected by the @{text Haskell} compiler.  To
haftmann@52378:   suppress this additional instance:
haftmann@59377: \<close>
haftmann@28419: 
haftmann@52378: code_printing %quotett
haftmann@52378:   class_instance bar :: "HOL.equal" \<rightharpoonup> (Haskell) -
haftmann@28419: 
haftmann@28561: 
haftmann@59377: subsection \<open>Enhancing the target language context \label{sec:include}\<close>
haftmann@28561: 
haftmann@59377: text \<open>
haftmann@28593:   In rare cases it is necessary to \emph{enrich} the context of a
haftmann@52378:   target language; this can also be accomplished using the @{command
haftmann@52378:   "code_printing"} command:
haftmann@59377: \<close>
haftmann@28561: 
haftmann@52378: code_printing %quotett
haftmann@59379:   code_module "Errno" \<rightharpoonup> (Haskell)
haftmann@59379:     \<open>errno i = error ("Error number: " ++ show i)\<close>
haftmann@28561: 
haftmann@39745: code_reserved %quotett Haskell Errno
haftmann@28561: 
haftmann@59377: text \<open>
haftmann@52378:   \noindent Such named modules are then prepended to every
haftmann@38450:   generated code.  Inspect such code in order to find out how
haftmann@52378:   this behaves with respect to a particular
haftmann@38450:   target language.
haftmann@59377: \<close>
haftmann@28561: 
haftmann@28419: end
haftmann@46519: