doc-src/Codegen/Thy/Adaption.thy
changeset 31108 0ce5f53fc65d
parent 31107 657386d94f14
parent 31093 ee45b1c733c1
child 31109 54092b86ef81
equal deleted inserted replaced
31107:657386d94f14 31108:0ce5f53fc65d
     1 theory Adaption
       
     2 imports Setup
       
     3 begin
       
     4 
       
     5 setup %invisible {* Code_Target.extend_target ("\<SML>", ("SML", K I)) *}
       
     6 
       
     7 section {* Adaption to target languages \label{sec:adaption} *}
       
     8 
       
     9 subsection {* Adapting code generation *}
       
    10 
       
    11 text {*
       
    12   The aspects of code generation introduced so far have two aspects
       
    13   in common:
       
    14 
       
    15   \begin{itemize}
       
    16     \item They act uniformly, without reference to a specific
       
    17        target language.
       
    18     \item They are \emph{safe} in the sense that as long as you trust
       
    19        the code generator meta theory and implementation, you cannot
       
    20        produce programs that yield results which are not derivable
       
    21        in the logic.
       
    22   \end{itemize}
       
    23 
       
    24   \noindent In this section we will introduce means to \emph{adapt} the serialiser
       
    25   to a specific target language, i.e.~to print program fragments
       
    26   in a way which accommodates \qt{already existing} ingredients of
       
    27   a target language environment, for three reasons:
       
    28 
       
    29   \begin{itemize}
       
    30     \item improving readability and aesthetics of generated code
       
    31     \item gaining efficiency
       
    32     \item interface with language parts which have no direct counterpart
       
    33       in @{text "HOL"} (say, imperative data structures)
       
    34   \end{itemize}
       
    35 
       
    36   \noindent Generally, you should avoid using those features yourself
       
    37   \emph{at any cost}:
       
    38 
       
    39   \begin{itemize}
       
    40     \item The safe configuration methods act uniformly on every target language,
       
    41       whereas for adaption you have to treat each target language separate.
       
    42     \item Application is extremely tedious since there is no abstraction
       
    43       which would allow for a static check, making it easy to produce garbage.
       
    44     \item More or less subtle errors can be introduced unconsciously.
       
    45   \end{itemize}
       
    46 
       
    47   \noindent However, even if you ought refrain from setting up adaption
       
    48   yourself, already the @{text "HOL"} comes with some reasonable default
       
    49   adaptions (say, using target language list syntax).  There also some
       
    50   common adaption cases which you can setup by importing particular
       
    51   library theories.  In order to understand these, we provide some clues here;
       
    52   these however are not supposed to replace a careful study of the sources.
       
    53 *}
       
    54 
       
    55 subsection {* The adaption principle *}
       
    56 
       
    57 text {*
       
    58   Figure \ref{fig:adaption} illustrates what \qt{adaption} is conceptually
       
    59   supposed to be:
       
    60 
       
    61   \begin{figure}[here]
       
    62     \includegraphics{adaption}
       
    63     \caption{The adaption principle}
       
    64     \label{fig:adaption}
       
    65   \end{figure}
       
    66 
       
    67   \noindent In the tame view, code generation acts as broker between
       
    68   @{text logic}, @{text "intermediate language"} and
       
    69   @{text "target language"} by means of @{text translation} and
       
    70   @{text serialisation};  for the latter, the serialiser has to observe
       
    71   the structure of the @{text language} itself plus some @{text reserved}
       
    72   keywords which have to be avoided for generated code.
       
    73   However, if you consider @{text adaption} mechanisms, the code generated
       
    74   by the serializer is just the tip of the iceberg:
       
    75 
       
    76   \begin{itemize}
       
    77     \item @{text serialisation} can be \emph{parametrised} such that
       
    78       logical entities are mapped to target-specific ones
       
    79       (e.g. target-specific list syntax,
       
    80         see also \secref{sec:adaption_mechanisms})
       
    81     \item Such parametrisations can involve references to a
       
    82       target-specific standard @{text library} (e.g. using
       
    83       the @{text Haskell} @{verbatim Maybe} type instead
       
    84       of the @{text HOL} @{type "option"} type);
       
    85       if such are used, the corresponding identifiers
       
    86       (in our example, @{verbatim Maybe}, @{verbatim Nothing}
       
    87       and @{verbatim Just}) also have to be considered @{text reserved}.
       
    88     \item Even more, the user can enrich the library of the
       
    89       target-language by providing code snippets
       
    90       (\qt{@{text "includes"}}) which are prepended to
       
    91       any generated code (see \secref{sec:include});  this typically
       
    92       also involves further @{text reserved} identifiers.
       
    93   \end{itemize}
       
    94 
       
    95   \noindent As figure \ref{fig:adaption} illustrates, all these adaption mechanisms
       
    96   have to act consistently;  it is at the discretion of the user
       
    97   to take care for this.
       
    98 *}
       
    99 
       
   100 subsection {* Common adaption patterns *}
       
   101 
       
   102 text {*
       
   103   The @{theory HOL} @{theory Main} theory already provides a code
       
   104   generator setup
       
   105   which should be suitable for most applications.  Common extensions
       
   106   and modifications are available by certain theories of the @{text HOL}
       
   107   library; beside being useful in applications, they may serve
       
   108   as a tutorial for customising the code generator setup (see below
       
   109   \secref{sec:adaption_mechanisms}).
       
   110 
       
   111   \begin{description}
       
   112 
       
   113     \item[@{theory "Code_Integer"}] represents @{text HOL} integers by big
       
   114        integer literals in target languages.
       
   115     \item[@{theory "Code_Char"}] represents @{text HOL} characters by 
       
   116        character literals in target languages.
       
   117     \item[@{theory "Code_Char_chr"}] like @{text "Code_Char"},
       
   118        but also offers treatment of character codes; includes
       
   119        @{theory "Code_Char"}.
       
   120     \item[@{theory "Efficient_Nat"}] \label{eff_nat} implements natural numbers by integers,
       
   121        which in general will result in higher efficiency; pattern
       
   122        matching with @{term "0\<Colon>nat"} / @{const "Suc"}
       
   123        is eliminated;  includes @{theory "Code_Integer"}
       
   124        and @{theory "Code_Index"}.
       
   125     \item[@{theory "Code_Index"}] provides an additional datatype
       
   126        @{typ index} which is mapped to target-language built-in integers.
       
   127        Useful for code setups which involve e.g. indexing of
       
   128        target-language arrays.
       
   129     \item[@{theory "Code_Message"}] provides an additional datatype
       
   130        @{typ message_string} which is isomorphic to strings;
       
   131        @{typ message_string}s are mapped to target-language strings.
       
   132        Useful for code setups which involve e.g. printing (error) messages.
       
   133 
       
   134   \end{description}
       
   135 
       
   136   \begin{warn}
       
   137     When importing any of these theories, they should form the last
       
   138     items in an import list.  Since these theories adapt the
       
   139     code generator setup in a non-conservative fashion,
       
   140     strange effects may occur otherwise.
       
   141   \end{warn}
       
   142 *}
       
   143 
       
   144 
       
   145 subsection {* Parametrising serialisation \label{sec:adaption_mechanisms} *}
       
   146 
       
   147 text {*
       
   148   Consider the following function and its corresponding
       
   149   SML code:
       
   150 *}
       
   151 
       
   152 primrec %quote in_interval :: "nat \<times> nat \<Rightarrow> nat \<Rightarrow> bool" where
       
   153   "in_interval (k, l) n \<longleftrightarrow> k \<le> n \<and> n \<le> l"
       
   154 (*<*)
       
   155 code_type %invisible bool
       
   156   (SML)
       
   157 code_const %invisible True and False and "op \<and>" and Not
       
   158   (SML and and and)
       
   159 (*>*)
       
   160 text %quote {*@{code_stmts in_interval (SML)}*}
       
   161 
       
   162 text {*
       
   163   \noindent Though this is correct code, it is a little bit unsatisfactory:
       
   164   boolean values and operators are materialised as distinguished
       
   165   entities with have nothing to do with the SML-built-in notion
       
   166   of \qt{bool}.  This results in less readable code;
       
   167   additionally, eager evaluation may cause programs to
       
   168   loop or break which would perfectly terminate when
       
   169   the existing SML @{verbatim "bool"} would be used.  To map
       
   170   the HOL @{typ bool} on SML @{verbatim "bool"}, we may use
       
   171   \qn{custom serialisations}:
       
   172 *}
       
   173 
       
   174 code_type %quotett bool
       
   175   (SML "bool")
       
   176 code_const %quotett True and False and "op \<and>"
       
   177   (SML "true" and "false" and "_ andalso _")
       
   178 
       
   179 text {*
       
   180   \noindent The @{command code_type} command takes a type constructor
       
   181   as arguments together with a list of custom serialisations.
       
   182   Each custom serialisation starts with a target language
       
   183   identifier followed by an expression, which during
       
   184   code serialisation is inserted whenever the type constructor
       
   185   would occur.  For constants, @{command code_const} implements
       
   186   the corresponding mechanism.  Each ``@{verbatim "_"}'' in
       
   187   a serialisation expression is treated as a placeholder
       
   188   for the type constructor's (the constant's) arguments.
       
   189 *}
       
   190 
       
   191 text %quote {*@{code_stmts in_interval (SML)}*}
       
   192 
       
   193 text {*
       
   194   \noindent This still is not perfect: the parentheses
       
   195   around the \qt{andalso} expression are superfluous.
       
   196   Though the serialiser
       
   197   by no means attempts to imitate the rich Isabelle syntax
       
   198   framework, it provides some common idioms, notably
       
   199   associative infixes with precedences which may be used here:
       
   200 *}
       
   201 
       
   202 code_const %quotett "op \<and>"
       
   203   (SML infixl 1 "andalso")
       
   204 
       
   205 text %quote {*@{code_stmts in_interval (SML)}*}
       
   206 
       
   207 text {*
       
   208   \noindent The attentive reader may ask how we assert that no generated
       
   209   code will accidentally overwrite.  For this reason the serialiser has
       
   210   an internal table of identifiers which have to be avoided to be used
       
   211   for new declarations.  Initially, this table typically contains the
       
   212   keywords of the target language.  It can be extended manually, thus avoiding
       
   213   accidental overwrites, using the @{command "code_reserved"} command:
       
   214 *}
       
   215 
       
   216 code_reserved %quote "\<SML>" bool true false andalso
       
   217 
       
   218 text {*
       
   219   \noindent Next, we try to map HOL pairs to SML pairs, using the
       
   220   infix ``@{verbatim "*"}'' type constructor and parentheses:
       
   221 *}
       
   222 (*<*)
       
   223 code_type %invisible *
       
   224   (SML)
       
   225 code_const %invisible Pair
       
   226   (SML)
       
   227 (*>*)
       
   228 code_type %quotett *
       
   229   (SML infix 2 "*")
       
   230 code_const %quotett Pair
       
   231   (SML "!((_),/ (_))")
       
   232 
       
   233 text {*
       
   234   \noindent The initial bang ``@{verbatim "!"}'' tells the serialiser
       
   235   never to put
       
   236   parentheses around the whole expression (they are already present),
       
   237   while the parentheses around argument place holders
       
   238   tell not to put parentheses around the arguments.
       
   239   The slash ``@{verbatim "/"}'' (followed by arbitrary white space)
       
   240   inserts a space which may be used as a break if necessary
       
   241   during pretty printing.
       
   242 
       
   243   These examples give a glimpse what mechanisms
       
   244   custom serialisations provide; however their usage
       
   245   requires careful thinking in order not to introduce
       
   246   inconsistencies -- or, in other words:
       
   247   custom serialisations are completely axiomatic.
       
   248 
       
   249   A further noteworthy details is that any special
       
   250   character in a custom serialisation may be quoted
       
   251   using ``@{verbatim "'"}''; thus, in
       
   252   ``@{verbatim "fn '_ => _"}'' the first
       
   253   ``@{verbatim "_"}'' is a proper underscore while the
       
   254   second ``@{verbatim "_"}'' is a placeholder.
       
   255 *}
       
   256 
       
   257 
       
   258 subsection {* @{text Haskell} serialisation *}
       
   259 
       
   260 text {*
       
   261   For convenience, the default
       
   262   @{text HOL} setup for @{text Haskell} maps the @{class eq} class to
       
   263   its counterpart in @{text Haskell}, giving custom serialisations
       
   264   for the class @{class eq} (by command @{command code_class}) and its operation
       
   265   @{const HOL.eq}
       
   266 *}
       
   267 
       
   268 code_class %quotett eq
       
   269   (Haskell "Eq")
       
   270 
       
   271 code_const %quotett "op ="
       
   272   (Haskell infixl 4 "==")
       
   273 
       
   274 text {*
       
   275   \noindent A problem now occurs whenever a type which
       
   276   is an instance of @{class eq} in @{text HOL} is mapped
       
   277   on a @{text Haskell}-built-in type which is also an instance
       
   278   of @{text Haskell} @{text Eq}:
       
   279 *}
       
   280 
       
   281 typedecl %quote bar
       
   282 
       
   283 instantiation %quote bar :: eq
       
   284 begin
       
   285 
       
   286 definition %quote "eq_class.eq (x\<Colon>bar) y \<longleftrightarrow> x = y"
       
   287 
       
   288 instance %quote by default (simp add: eq_bar_def)
       
   289 
       
   290 end %quote (*<*)
       
   291 
       
   292 (*>*) code_type %quotett bar
       
   293   (Haskell "Integer")
       
   294 
       
   295 text {*
       
   296   \noindent The code generator would produce
       
   297   an additional instance, which of course is rejected by the @{text Haskell}
       
   298   compiler.
       
   299   To suppress this additional instance, use
       
   300   @{text "code_instance"}:
       
   301 *}
       
   302 
       
   303 code_instance %quotett bar :: eq
       
   304   (Haskell -)
       
   305 
       
   306 
       
   307 subsection {* Enhancing the target language context \label{sec:include} *}
       
   308 
       
   309 text {*
       
   310   In rare cases it is necessary to \emph{enrich} the context of a
       
   311   target language;  this is accomplished using the @{command "code_include"}
       
   312   command:
       
   313 *}
       
   314 
       
   315 code_include %quotett Haskell "Errno"
       
   316 {*errno i = error ("Error number: " ++ show i)*}
       
   317 
       
   318 code_reserved %quotett Haskell Errno
       
   319 
       
   320 text {*
       
   321   \noindent Such named @{text include}s are then prepended to every generated code.
       
   322   Inspect such code in order to find out how @{command "code_include"} behaves
       
   323   with respect to a particular target language.
       
   324 *}
       
   325 
       
   326 end