haftmann@31050: theory Adaptation haftmann@28213: imports Setup haftmann@28213: begin haftmann@28213: haftmann@59377: setup %invisible \Code_Target.add_derived_target ("\", [("SML", I)]) haftmann@59377: #> Code_Target.add_derived_target ("\", [("Haskell", I)])\ haftmann@28561: haftmann@59377: section \Adaptation to target languages \label{sec:adaptation}\ haftmann@28419: haftmann@59377: subsection \Adapting code generation\ haftmann@28561: haftmann@59377: text \ haftmann@28561: The aspects of code generation introduced so far have two aspects haftmann@28561: in common: haftmann@28561: haftmann@28561: \begin{itemize} haftmann@38450: haftmann@38450: \item They act uniformly, without reference to a specific target haftmann@38450: language. haftmann@38450: haftmann@28561: \item They are \emph{safe} in the sense that as long as you trust haftmann@28561: the code generator meta theory and implementation, you cannot haftmann@38450: produce programs that yield results which are not derivable in haftmann@38450: the logic. haftmann@38450: haftmann@28561: \end{itemize} haftmann@28561: haftmann@38450: \noindent In this section we will introduce means to \emph{adapt} haftmann@38450: the serialiser to a specific target language, i.e.~to print program haftmann@38450: fragments in a way which accommodates \qt{already existing} haftmann@38450: ingredients of a target language environment, for three reasons: haftmann@28561: haftmann@28561: \begin{itemize} haftmann@28593: \item improving readability and aesthetics of generated code haftmann@28561: \item gaining efficiency haftmann@28561: \item interface with language parts which have no direct counterpart haftmann@28561: in @{text "HOL"} (say, imperative data structures) haftmann@28561: \end{itemize} haftmann@28561: haftmann@28561: \noindent Generally, you should avoid using those features yourself haftmann@28561: \emph{at any cost}: haftmann@28561: haftmann@28561: \begin{itemize} haftmann@38450: haftmann@38450: \item The safe configuration methods act uniformly on every target haftmann@38450: language, whereas for adaptation you have to treat each target haftmann@38450: language separately. haftmann@38450: haftmann@38450: \item Application is extremely tedious since there is no haftmann@38450: abstraction which would allow for a static check, making it easy haftmann@38450: to produce garbage. haftmann@38450: paulson@34155: \item Subtle errors can be introduced unconsciously. haftmann@38450: haftmann@28561: \end{itemize} haftmann@28561: haftmann@38450: \noindent However, even if you ought refrain from setting up haftmann@51162: adaptation yourself, already @{text "HOL"} comes with some haftmann@38450: reasonable default adaptations (say, using target language list haftmann@38450: syntax). There also some common adaptation cases which you can haftmann@38450: setup by importing particular library theories. In order to haftmann@38450: understand these, we provide some clues here; these however are not haftmann@38450: supposed to replace a careful study of the sources. haftmann@59377: \ haftmann@28561: haftmann@38450: haftmann@59377: subsection \The adaptation principle\ haftmann@28561: haftmann@59377: text \ haftmann@38450: Figure \ref{fig:adaptation} illustrates what \qt{adaptation} is haftmann@38450: conceptually supposed to be: haftmann@28601: wenzelm@60768: \begin{figure}[h] wenzelm@52742: \begin{tikzpicture}[scale = 0.5] wenzelm@52742: \tikzstyle water=[color = blue, thick] wenzelm@52742: \tikzstyle ice=[color = black, very thick, cap = round, join = round, fill = white] wenzelm@52742: \tikzstyle process=[color = green, semithick, ->] wenzelm@52742: \tikzstyle adaptation=[color = red, semithick, ->] wenzelm@52742: \tikzstyle target=[color = black] wenzelm@52742: \foreach \x in {0, ..., 24} wenzelm@52742: \draw[style=water] (\x, 0.25) sin + (0.25, 0.25) cos + (0.25, -0.25) sin wenzelm@52742: + (0.25, -0.25) cos + (0.25, 0.25); wenzelm@52742: \draw[style=ice] (1, 0) -- wenzelm@52742: (3, 6) node[above, fill=white] {logic} -- (5, 0) -- cycle; wenzelm@52742: \draw[style=ice] (9, 0) -- wenzelm@52742: (11, 6) node[above, fill=white] {intermediate language} -- (13, 0) -- cycle; wenzelm@52742: \draw[style=ice] (15, -6) -- wenzelm@52742: (19, 6) node[above, fill=white] {target language} -- (23, -6) -- cycle; wenzelm@52742: \draw[style=process] wenzelm@52742: (3.5, 3) .. controls (7, 5) .. node[fill=white] {translation} (10.5, 3); wenzelm@52742: \draw[style=process] wenzelm@52742: (11.5, 3) .. controls (15, 5) .. node[fill=white] (serialisation) {serialisation} (18.5, 3); wenzelm@52742: \node (adaptation) at (11, -2) [style=adaptation] {adaptation}; wenzelm@52742: \node at (19, 3) [rotate=90] {generated}; wenzelm@52742: \node at (19.5, -5) {language}; wenzelm@52742: \node at (19.5, -3) {library}; wenzelm@52742: \node (includes) at (19.5, -1) {includes}; wenzelm@52742: \node (reserved) at (16.5, -3) [rotate=72] {reserved}; % proper 71.57 wenzelm@52742: \draw[style=process] wenzelm@52742: (includes) -- (serialisation); wenzelm@52742: \draw[style=process] wenzelm@52742: (reserved) -- (serialisation); wenzelm@52742: \draw[style=adaptation] wenzelm@52742: (adaptation) -- (serialisation); wenzelm@52742: \draw[style=adaptation] wenzelm@52742: (adaptation) -- (includes); wenzelm@52742: \draw[style=adaptation] wenzelm@52742: (adaptation) -- (reserved); wenzelm@52742: \end{tikzpicture} haftmann@31050: \caption{The adaptation principle} haftmann@31050: \label{fig:adaptation} haftmann@28601: \end{figure} haftmann@28601: haftmann@28601: \noindent In the tame view, code generation acts as broker between haftmann@38450: @{text logic}, @{text "intermediate language"} and @{text "target haftmann@38450: language"} by means of @{text translation} and @{text haftmann@38450: serialisation}; for the latter, the serialiser has to observe the haftmann@38450: structure of the @{text language} itself plus some @{text reserved} haftmann@38450: keywords which have to be avoided for generated code. However, if haftmann@38450: you consider @{text adaptation} mechanisms, the code generated by haftmann@38450: the serializer is just the tip of the iceberg: haftmann@28601: haftmann@28601: \begin{itemize} haftmann@38450: haftmann@28635: \item @{text serialisation} can be \emph{parametrised} such that haftmann@28635: logical entities are mapped to target-specific ones haftmann@38450: (e.g. target-specific list syntax, see also haftmann@38450: \secref{sec:adaptation_mechanisms}) haftmann@38450: haftmann@28635: \item Such parametrisations can involve references to a haftmann@38450: target-specific standard @{text library} (e.g. using the @{text haftmann@38450: Haskell} @{verbatim Maybe} type instead of the @{text HOL} haftmann@38450: @{type "option"} type); if such are used, the corresponding haftmann@38450: identifiers (in our example, @{verbatim Maybe}, @{verbatim haftmann@38450: Nothing} and @{verbatim Just}) also have to be considered @{text haftmann@38450: reserved}. haftmann@38450: haftmann@28635: \item Even more, the user can enrich the library of the haftmann@38450: target-language by providing code snippets (\qt{@{text haftmann@38450: "includes"}}) which are prepended to any generated code (see haftmann@38450: \secref{sec:include}); this typically also involves further haftmann@38450: @{text reserved} identifiers. haftmann@38450: haftmann@28601: \end{itemize} haftmann@28635: haftmann@38450: \noindent As figure \ref{fig:adaptation} illustrates, all these haftmann@38450: adaptation mechanisms have to act consistently; it is at the haftmann@38450: discretion of the user to take care for this. haftmann@59377: \ haftmann@28561: haftmann@59482: subsection \Common adaptation applications\ haftmann@28419: haftmann@59377: text \ haftmann@28428: The @{theory HOL} @{theory Main} theory already provides a code haftmann@38450: generator setup which should be suitable for most applications. haftmann@38450: Common extensions and modifications are available by certain wenzelm@63680: theories in \<^dir>\~~/src/HOL/Library\; beside being useful in haftmann@38450: applications, they may serve as a tutorial for customising the code haftmann@38450: generator setup (see below \secref{sec:adaptation_mechanisms}). haftmann@28419: haftmann@28419: \begin{description} haftmann@28419: haftmann@51143: \item[@{theory "Code_Numeral"}] provides additional numeric haftmann@51143: types @{typ integer} and @{typ natural} isomorphic to types haftmann@51143: @{typ int} and @{typ nat} respectively. Type @{typ integer} haftmann@51143: is mapped to target-language built-in integers; @{typ natural} haftmann@51143: is implemented as abstract type over @{typ integer}. haftmann@51143: Useful for code setups which involve e.g.~indexing haftmann@51143: of target-language arrays. Part of @{text "HOL-Main"}. haftmann@51143: haftmann@51143: \item[@{text "Code_Target_Int"}] implements type @{typ int} haftmann@51143: by @{typ integer} and thus by target-language built-in integers. haftmann@38450: haftmann@51171: \item[@{text "Code_Binary_Nat"}] implements type haftmann@51143: @{typ nat} using a binary rather than a linear representation, haftmann@51143: which yields a considerable speedup for computations. wenzelm@61076: Pattern matching with @{term "0::nat"} / @{const "Suc"} is eliminated haftmann@51171: by a preprocessor.\label{abstract_nat} haftmann@51143: haftmann@51171: \item[@{text "Code_Target_Nat"}] implements type @{typ nat} haftmann@51171: by @{typ integer} and thus by target-language built-in integers. wenzelm@61076: Pattern matching with @{term "0::nat"} / @{const "Suc"} is eliminated haftmann@51171: by a preprocessor. haftmann@51143: haftmann@51162: \item[@{text "Code_Target_Numeral"}] is a convenience theory haftmann@51143: containing both @{text "Code_Target_Nat"} and haftmann@51143: @{text "Code_Target_Int"}. haftmann@38450: haftmann@38450: \item[@{theory "String"}] provides an additional datatype @{typ haftmann@38450: String.literal} which is isomorphic to strings; @{typ haftmann@38450: String.literal}s are mapped to target-language strings. Useful haftmann@38450: for code setups which involve e.g.~printing (error) messages. haftmann@46519: Part of @{text "HOL-Main"}. haftmann@28419: haftmann@59482: \item[@{text "Code_Char"}] represents @{text HOL} characters by haftmann@59482: character literals in target languages. \emph{Warning:} This haftmann@59482: modifies adaptation in a non-conservative manner and thus haftmann@59482: should always be imported \emph{last} in a theory header. haftmann@59482: haftmann@51162: \item[@{theory "IArray"}] provides a type @{typ "'a iarray"} haftmann@51162: isomorphic to lists but implemented by (effectively immutable) haftmann@51162: arrays \emph{in SML only}. haftmann@28419: haftmann@51162: \end{description} haftmann@59377: \ haftmann@28419: haftmann@28419: haftmann@59377: subsection \Parametrising serialisation \label{sec:adaptation_mechanisms}\ haftmann@28419: haftmann@59377: text \ haftmann@38450: Consider the following function and its corresponding SML code: haftmann@59377: \ haftmann@28419: haftmann@28564: primrec %quote in_interval :: "nat \ nat \ nat \ bool" where haftmann@28419: "in_interval (k, l) n \ k \ n \ n \ l" haftmann@28447: (*<*) haftmann@52378: code_printing %invisible haftmann@52378: type_constructor bool \ (SML) haftmann@52378: | constant True \ (SML) haftmann@52378: | constant False \ (SML) haftmann@52378: | constant HOL.conj \ (SML) haftmann@52378: | constant Not \ (SML) haftmann@28447: (*>*) haftmann@59377: text %quotetypewriter \ haftmann@39683: @{code_stmts in_interval (SML)} haftmann@59377: \ haftmann@28419: haftmann@59377: text \ haftmann@38450: \noindent Though this is correct code, it is a little bit haftmann@38450: unsatisfactory: boolean values and operators are materialised as haftmann@38450: distinguished entities with have nothing to do with the SML-built-in haftmann@38450: notion of \qt{bool}. This results in less readable code; haftmann@38450: additionally, eager evaluation may cause programs to loop or break haftmann@38450: which would perfectly terminate when the existing SML @{verbatim haftmann@38450: "bool"} would be used. To map the HOL @{typ bool} on SML @{verbatim haftmann@38450: "bool"}, we may use \qn{custom serialisations}: haftmann@59377: \ haftmann@28419: haftmann@52378: code_printing %quotett haftmann@52378: type_constructor bool \ (SML) "bool" haftmann@52378: | constant True \ (SML) "true" haftmann@52378: | constant False \ (SML) "false" haftmann@52378: | constant HOL.conj \ (SML) "_ andalso _" haftmann@28213: haftmann@59377: text \ haftmann@52378: \noindent The @{command_def code_printing} command takes a series haftmann@52378: of symbols (contants, type constructor, \ldots) haftmann@52378: together with target-specific custom serialisations. Each haftmann@38450: custom serialisation starts with a target language identifier haftmann@38450: followed by an expression, which during code serialisation is haftmann@52378: inserted whenever the type constructor would occur. Each haftmann@38450: ``@{verbatim "_"}'' in a serialisation expression is treated as a haftmann@52378: placeholder for the constant's or the type constructor's arguments. haftmann@59377: \ haftmann@28419: haftmann@59377: text %quotetypewriter \ haftmann@39683: @{code_stmts in_interval (SML)} haftmann@59377: \ haftmann@28419: haftmann@59377: text \ haftmann@38450: \noindent This still is not perfect: the parentheses around the haftmann@38450: \qt{andalso} expression are superfluous. Though the serialiser by haftmann@38450: no means attempts to imitate the rich Isabelle syntax framework, it haftmann@38450: provides some common idioms, notably associative infixes with haftmann@38450: precedences which may be used here: haftmann@59377: \ haftmann@28419: haftmann@52378: code_printing %quotett haftmann@52378: constant HOL.conj \ (SML) infixl 1 "andalso" haftmann@28419: haftmann@59377: text %quotetypewriter \ haftmann@39683: @{code_stmts in_interval (SML)} haftmann@59377: \ haftmann@28419: haftmann@59377: text \ haftmann@38450: \noindent The attentive reader may ask how we assert that no haftmann@38450: generated code will accidentally overwrite. For this reason the haftmann@38450: serialiser has an internal table of identifiers which have to be haftmann@38450: avoided to be used for new declarations. Initially, this table haftmann@38450: typically contains the keywords of the target language. It can be haftmann@38450: extended manually, thus avoiding accidental overwrites, using the haftmann@38505: @{command_def "code_reserved"} command: haftmann@59377: \ haftmann@28561: haftmann@40351: code_reserved %quote "\" bool true false andalso haftmann@28561: haftmann@59377: text \ haftmann@28447: \noindent Next, we try to map HOL pairs to SML pairs, using the haftmann@28419: infix ``@{verbatim "*"}'' type constructor and parentheses: haftmann@59377: \ haftmann@28447: (*<*) haftmann@52378: code_printing %invisible haftmann@52378: type_constructor prod \ (SML) haftmann@52378: | constant Pair \ (SML) haftmann@28447: (*>*) haftmann@52378: code_printing %quotett haftmann@52378: type_constructor prod \ (SML) infix 2 "*" haftmann@52378: | constant Pair \ (SML) "!((_),/ (_))" haftmann@28419: haftmann@59377: text \ haftmann@28593: \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser haftmann@38450: never to put parentheses around the whole expression (they are haftmann@38450: already present), while the parentheses around argument place haftmann@38450: holders tell not to put parentheses around the arguments. The slash haftmann@38450: ``@{verbatim "/"}'' (followed by arbitrary white space) inserts a haftmann@38450: space which may be used as a break if necessary during pretty haftmann@38450: printing. haftmann@28419: haftmann@38450: These examples give a glimpse what mechanisms custom serialisations haftmann@38450: provide; however their usage requires careful thinking in order not haftmann@38450: to introduce inconsistencies -- or, in other words: custom haftmann@38450: serialisations are completely axiomatic. haftmann@28419: haftmann@39643: A further noteworthy detail is that any special character in a haftmann@38450: custom serialisation may be quoted using ``@{verbatim "'"}''; thus, haftmann@38450: in ``@{verbatim "fn '_ => _"}'' the first ``@{verbatim "_"}'' is a haftmann@38450: proper underscore while the second ``@{verbatim "_"}'' is a haftmann@38450: placeholder. haftmann@59377: \ haftmann@28419: haftmann@28419: haftmann@59377: subsection \@{text Haskell} serialisation\ haftmann@28419: haftmann@59377: text \ haftmann@38450: For convenience, the default @{text HOL} setup for @{text Haskell} haftmann@39063: maps the @{class equal} class to its counterpart in @{text Haskell}, haftmann@52378: giving custom serialisations for the class @{class equal} haftmann@52378: and its operation @{const [source] HOL.equal}. haftmann@59377: \ haftmann@28419: haftmann@52378: code_printing %quotett haftmann@52378: type_class equal \ (Haskell) "Eq" haftmann@52378: | constant HOL.equal \ (Haskell) infixl 4 "==" haftmann@28419: haftmann@59377: text \ haftmann@38450: \noindent A problem now occurs whenever a type which is an instance haftmann@39063: of @{class equal} in @{text HOL} is mapped on a @{text haftmann@38450: Haskell}-built-in type which is also an instance of @{text Haskell} haftmann@38450: @{text Eq}: haftmann@59377: \ haftmann@28419: haftmann@28564: typedecl %quote bar haftmann@28419: haftmann@39063: instantiation %quote bar :: equal haftmann@28419: begin haftmann@28419: wenzelm@61076: definition %quote "HOL.equal (x::bar) y \ x = y" haftmann@28419: wenzelm@61169: instance %quote by standard (simp add: equal_bar_def) haftmann@28213: haftmann@30880: end %quote (*<*) haftmann@30880: haftmann@52378: (*>*) code_printing %quotett haftmann@52378: type_constructor bar \ (Haskell) "Integer" haftmann@28419: haftmann@59377: text \ haftmann@38450: \noindent The code generator would produce an additional instance, haftmann@38450: which of course is rejected by the @{text Haskell} compiler. To haftmann@52378: suppress this additional instance: haftmann@59377: \ haftmann@28419: haftmann@52378: code_printing %quotett haftmann@52378: class_instance bar :: "HOL.equal" \ (Haskell) - haftmann@28419: haftmann@28561: haftmann@59377: subsection \Enhancing the target language context \label{sec:include}\ haftmann@28561: haftmann@59377: text \ haftmann@28593: In rare cases it is necessary to \emph{enrich} the context of a haftmann@52378: target language; this can also be accomplished using the @{command haftmann@52378: "code_printing"} command: haftmann@59377: \ haftmann@28561: haftmann@52378: code_printing %quotett haftmann@59379: code_module "Errno" \ (Haskell) haftmann@59379: \errno i = error ("Error number: " ++ show i)\ haftmann@28561: haftmann@39745: code_reserved %quotett Haskell Errno haftmann@28561: haftmann@59377: text \ haftmann@52378: \noindent Such named modules are then prepended to every haftmann@38450: generated code. Inspect such code in order to find out how haftmann@52378: this behaves with respect to a particular haftmann@38450: target language. haftmann@59377: \ haftmann@28561: haftmann@28419: end haftmann@46519: