doc-src/IsarAdvanced/Codegen/Thy/Codegen.thy
changeset 22550 c5039bee2602
parent 22479 de15ea8fb348
child 22751 1bfd75c1f232
equal deleted inserted replaced
22549:ab23925c64d6 22550:c5039bee2602
    24   application of theorem proving systems in the area of
    24   application of theorem proving systems in the area of
    25   software development and verification, its relevance manifests
    25   software development and verification, its relevance manifests
    26   for running test cases and rapid prototyping.  In logical
    26   for running test cases and rapid prototyping.  In logical
    27   calculi like constructive type theory,
    27   calculi like constructive type theory,
    28   a notion of executability is implicit due to the nature
    28   a notion of executability is implicit due to the nature
    29   of the calculus.  In contrast, specifications in Isabelle/HOL
    29   of the calculus.  In contrast, specifications in Isabelle
    30   can be highly non-executable.  In order to bridge
    30   can be highly non-executable.  In order to bridge
    31   the gap between logic and executable specifications,
    31   the gap between logic and executable specifications,
    32   an explicit non-trivial transformation has to be applied:
    32   an explicit non-trivial transformation has to be applied:
    33   code generation.
    33   code generation.
    34 
    34 
    36   Isabelle system \cite{isa-tutorial}.
    36   Isabelle system \cite{isa-tutorial}.
    37   Generic in the sense that the
    37   Generic in the sense that the
    38   \qn{target language} for which code shall ultimately be
    38   \qn{target language} for which code shall ultimately be
    39   generated is not fixed but may be an arbitrary state-of-the-art
    39   generated is not fixed but may be an arbitrary state-of-the-art
    40   functional programming language (currently, the implementation
    40   functional programming language (currently, the implementation
    41   supports SML \cite{SML}, OCaml \cite{OCaml} and Haskell \cite{haskell-revised-report}).
    41   supports SML \cite{SML}, OCaml \cite{OCaml} and Haskell
       
    42   \cite{haskell-revised-report}).
    42   We aim to provide a
    43   We aim to provide a
    43   versatile environment
    44   versatile environment
    44   suitable for software development and verification,
    45   suitable for software development and verification,
    45   structuring the process
    46   structuring the process
    46   of code generation into a small set of orthogonal principles
    47   of code generation into a small set of orthogonal principles
    47   while achieving a big coverage of application areas
    48   while achieving a big coverage of application areas
    48   with maximum flexibility.
    49   with maximum flexibility.
    49 
    50 
    50   For readers, some familiarity and experience
    51   Conceptually the code generator framework is part
    51   with the ingredients
    52   of Isabelle's @{text Pure} meta logic; the object logic
    52   of the HOL \emph{Main} theory is assumed.
    53   @{text HOL} which is an extension of @{text Pure}
       
    54   already comes with a reasonable framework setup and thus provides
       
    55   a good working horse for raising code-generation-driven
       
    56   applications.  So, we assume some familiarity and experience
       
    57   with the ingredients of the @{text HOL} \emph{Main} theory
       
    58   (see also \cite{isa-tutorial}).
    53 *}
    59 *}
    54 
    60 
    55 
    61 
    56 subsection {* Overview *}
    62 subsection {* Overview *}
    57 
    63 
    58 text {*
    64 text {*
    59   The code generator aims to be usable with no further ado
    65   The code generator aims to be usable with no further ado
    60   in most cases while allowing for detailed customization.
    66   in most cases while allowing for detailed customization.
    61   This manifests in the structure of this tutorial: this introduction
    67   This manifests in the structure of this tutorial:
    62   continues with a short introduction of concepts.  Section
    68   we start with a generic example \secref{sec:example}
       
    69   and introduce code generation concepts \secref{sec:concept}.
       
    70   Section
    63   \secref{sec:basics} explains how to use the framework naively,
    71   \secref{sec:basics} explains how to use the framework naively,
    64   presuming a reasonable default setup.  Then, section
    72   presuming a reasonable default setup.  Then, section
    65   \secref{sec:advanced} deals with advanced topics,
    73   \secref{sec:advanced} deals with advanced topics,
    66   introducing further aspects of the code generator framework
    74   introducing further aspects of the code generator framework
    67   in a motivation-driven manner.  Last, section \secref{sec:ml}
    75   in a motivation-driven manner.  Last, section \secref{sec:ml}
    72     is supposed to replace the already established code generator
    80     is supposed to replace the already established code generator
    73     by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
    81     by Stefan Berghofer \cite{Berghofer-Nipkow:2002}.
    74     So, for the moment, there are two distinct code generators
    82     So, for the moment, there are two distinct code generators
    75     in Isabelle.
    83     in Isabelle.
    76     Also note that while the framework itself is largely
    84     Also note that while the framework itself is largely
    77     object-logic independent, only HOL provides a reasonable
    85     object-logic independent, only @{text HOL} provides a reasonable
    78     framework setup.    
    86     framework setup.    
    79   \end{warn}
    87   \end{warn}
    80 *}
    88 *}
    81 
    89 
    82 
    90 
    83 subsection {* An exmaple: a simple theory of search trees *}
    91 section {* An example: a simple theory of search trees \label{sec:example} *}
       
    92 
       
    93 text {*
       
    94   When writing executable specifications, it is convenient to use
       
    95   three existing packages: the datatype package for defining
       
    96   datatypes, the function package for (recursive) functions,
       
    97   and the class package for overloaded definitions.
       
    98 
       
    99   We develope a small theory of search trees; trees are represented
       
   100   as a datatype with key type @{typ "'a"} and value type @{typ "'b"}:
       
   101 *}
    84 
   102 
    85 datatype ('a, 'b) searchtree = Leaf "'a\<Colon>linorder" 'b
   103 datatype ('a, 'b) searchtree = Leaf "'a\<Colon>linorder" 'b
    86   | Branch "('a, 'b) searchtree" "'a" "('a, 'b) searchtree"
   104   | Branch "('a, 'b) searchtree" "'a" "('a, 'b) searchtree"
       
   105 
       
   106 text {*
       
   107   \noindent Note that we have constrained the type of keys
       
   108   to the class of total orders, @{text linorder}.
       
   109 
       
   110   We define @{text find} and @{text update} functions:
       
   111 *}
    87 
   112 
    88 fun
   113 fun
    89   find :: "('a\<Colon>linorder, 'b) searchtree \<Rightarrow> 'a \<Rightarrow> 'b option" where
   114   find :: "('a\<Colon>linorder, 'b) searchtree \<Rightarrow> 'a \<Rightarrow> 'b option" where
    90   "find (Leaf key val) it = (if it = key then Some val else None)"
   115   "find (Leaf key val) it = (if it = key then Some val else None)"
    91   | "find (Branch t1 key t2) it = (if it \<le> key then find t1 it else find t2 it)"
   116   | "find (Branch t1 key t2) it = (if it \<le> key then find t1 it else find t2 it)"
   102     if it \<le> key
   127     if it \<le> key
   103       then (Branch (update (it, entry) t1) key t2)
   128       then (Branch (update (it, entry) t1) key t2)
   104       else (Branch t1 key (update (it, entry) t2))
   129       else (Branch t1 key (update (it, entry) t2))
   105    )"
   130    )"
   106 
   131 
       
   132 text {*
       
   133   \noindent For testing purpose, we define a small example
       
   134   using natural numbers @{typ nat} (which are a @{text linorder})
       
   135   as keys and strings values:
       
   136 *}
       
   137 
   107 fun
   138 fun
   108   example :: "nat \<Rightarrow> (nat, string) searchtree" where
   139   example :: "nat \<Rightarrow> (nat, string) searchtree" where
   109   "example n = update (n, ''bar'') (Leaf 0 ''foo'')"
   140   "example n = update (n, ''bar'') (Leaf 0 ''foo'')"
   110 
   141 
       
   142 text {*
       
   143   \noindent Then we generate code
       
   144 *}
       
   145 
   111 code_gen example (*<*)(SML #)(*>*)(SML "examples/tree.ML")
   146 code_gen example (*<*)(SML #)(*>*)(SML "examples/tree.ML")
   112 
   147 
   113 text {*
   148 text {*
       
   149   \noindent which looks like:
   114   \lstsml{Thy/examples/tree.ML}
   150   \lstsml{Thy/examples/tree.ML}
   115 *}
   151 *}
   116 
   152 
   117 subsection {* Code generation process *}
   153 
       
   154 section {* Code generation concepts and process \label{sec:concept} *}
   118 
   155 
   119 text {*
   156 text {*
   120   \begin{figure}[h]
   157   \begin{figure}[h]
   121   \centering
   158   \centering
   122   \includegraphics[width=0.7\textwidth]{codegen_process}
   159   \includegraphics[width=0.7\textwidth]{codegen_process}
   179   fac :: "nat \<Rightarrow> nat" where
   216   fac :: "nat \<Rightarrow> nat" where
   180     "fac 0 = 1"
   217     "fac 0 = 1"
   181   | "fac (Suc n) = Suc n * fac n"
   218   | "fac (Suc n) = Suc n * fac n"
   182 
   219 
   183 text {*
   220 text {*
   184   This executable specification is now turned to SML code:
   221   \noindent This executable specification is now turned to SML code:
   185 *}
   222 *}
   186 
   223 
   187 code_gen fac (*<*)(SML #)(*>*)(SML "examples/fac.ML")
   224 code_gen fac (*<*)(SML #)(*>*)(SML "examples/fac.ML")
   188 
   225 
   189 text {*
   226 text {*
   190   The @{text "\<CODEGEN>"} command takes a space-separated list of
   227   \noindent  The @{text "\<CODEGEN>"} command takes a space-separated list of
   191   constants together with \qn{serialization directives}
   228   constants together with \qn{serialization directives}
   192   in parentheses. These start with a \qn{target language}
   229   in parentheses. These start with a \qn{target language}
   193   identifier, followed by arguments, their semantics
   230   identifier, followed by arguments, their semantics
   194   depending on the target. In the SML case, a filename
   231   depending on the target. In the SML case, a filename
   195   is given where to write the generated code to.
   232   is given where to write the generated code to.
   200   code:
   237   code:
   201 
   238 
   202   \lstsml{Thy/examples/fac.ML}
   239   \lstsml{Thy/examples/fac.ML}
   203 
   240 
   204   The code generator will complain when a required
   241   The code generator will complain when a required
   205   ingredient does not provide a executable counterpart.
   242   ingredient does not provide a executable counterpart,
   206   This is the case if an involved type is not a datatype:
   243   e.g.~generating code
   207 *}
       
   208 
       
   209 (*<*)
       
   210 setup {* Sign.add_path "foo" *}
       
   211 (*>*)
       
   212 
       
   213 typedecl 'a foo
       
   214 
       
   215 definition
       
   216   bar :: "'a foo \<Rightarrow> 'a \<Rightarrow> 'a" where
       
   217   "bar x y = y"
       
   218 
       
   219 (*<*)
       
   220 hide type foo
       
   221 hide const bar
       
   222 
       
   223 setup {* Sign.parent_path *}
       
   224 
       
   225 datatype 'a foo = Foo
       
   226 
       
   227 definition
       
   228   bar :: "'a foo \<Rightarrow> 'a \<Rightarrow> 'a" where
       
   229   "bar x y = y"
       
   230 (*>*)
       
   231 
       
   232 code_gen bar (SML "examples/fail_type.ML")
       
   233 
       
   234 text {*
       
   235   \noindent will result in an error. Likewise, generating code
       
   236   for constants not yielding
   244   for constants not yielding
   237   a defining equation will fail, e.g.~the Hilbert choice
   245   a defining equation (e.g.~the Hilbert choice
   238   operation @{text "SOME"}:
   246   operation @{text "SOME"}):
   239 *}
   247 *}
   240 
   248 
   241 (*<*)
   249 (*<*)
   242 setup {* Sign.add_path "foo" *}
   250 setup {* Sign.add_path "foo" *}
   243 (*>*)
   251 (*>*)
   256   "pick_some = hd"
   264   "pick_some = hd"
   257 (*>*)
   265 (*>*)
   258 
   266 
   259 code_gen pick_some (SML "examples/fail_const.ML")
   267 code_gen pick_some (SML "examples/fail_const.ML")
   260 
   268 
       
   269 text {* \noindent will fail. *}
       
   270 
   261 subsection {* Theorem selection *}
   271 subsection {* Theorem selection *}
   262 
   272 
   263 text {*
   273 text {*
   264   The list of all defining equations in a theory may be inspected
   274   The list of all defining equations in a theory may be inspected
   265   using the @{text "\<PRINTCODESETUP>"} command:
   275   using the @{text "\<PRINTCODESETUP>"} command:
   269 
   279 
   270 text {*
   280 text {*
   271   \noindent which displays a table of constant with corresponding
   281   \noindent which displays a table of constant with corresponding
   272   defining equations (the additional stuff displayed
   282   defining equations (the additional stuff displayed
   273   shall not bother us for the moment). If this table does
   283   shall not bother us for the moment). If this table does
   274   not provide at least one function
   284   not provide at least one defining
   275   equation, the table of primitive definitions is searched
   285   equation for a particular constant, the table of primitive
       
   286   definitions is searched
   276   whether it provides one.
   287   whether it provides one.
   277 
   288 
   278   The typical HOL tools are already set up in a way that
   289   The typical HOL tools are already set up in a way that
   279   function definitions introduced by @{text "\<FUN>"},
   290   function definitions introduced by @{text "\<FUN>"},
   280   @{text "\<FUNCTION>"}, @{text "\<PRIMREC>"}
   291   @{text "\<FUNCTION>"}, @{text "\<PRIMREC>"}
   445 text {*
   456 text {*
   446   \noindent print all cached defining equations (i.e.~\emph{after}
   457   \noindent print all cached defining equations (i.e.~\emph{after}
   447   any applied transformation).  A
   458   any applied transformation).  A
   448   list of constants may be given; their function
   459   list of constants may be given; their function
   449   equations are added to the cache if not already present.
   460   equations are added to the cache if not already present.
   450 *}
   461 
       
   462   Similarly, the @{text "\<CODEDEPS>"} command shows a graph
       
   463   visualizing dependencies between defining equations.
       
   464 *}
       
   465 
   451 
   466 
   452 
   467 
   453 section {* Recipes and advanced topics \label{sec:advanced} *}
   468 section {* Recipes and advanced topics \label{sec:advanced} *}
   454 
   469 
   455 text {*
   470 text {*
  1018 text {*
  1033 text {*
  1019   would mean nothing else than to introduce the evil
  1034   would mean nothing else than to introduce the evil
  1020   sort constraint by hand.
  1035   sort constraint by hand.
  1021 *}
  1036 *}
  1022 
  1037 
       
  1038 
       
  1039 subsection {* Constructor sets for datatypes *}
       
  1040 
       
  1041 text {*
       
  1042   \fixme
       
  1043 *}
       
  1044 
       
  1045 
  1023 subsection {* Cyclic module dependencies *}
  1046 subsection {* Cyclic module dependencies *}
  1024 
  1047 
  1025 text {*
  1048 text {*
  1026   Sometimes the awkward situation occurs that dependencies
  1049   Sometimes the awkward situation occurs that dependencies
  1027   between definitions introduce cyclic dependencies
  1050   between definitions introduce cyclic dependencies
  1145   constants by unique identifiers.
  1168   constants by unique identifiers.
  1146 *}
  1169 *}
  1147 
  1170 
  1148 text %mlref {*
  1171 text %mlref {*
  1149   \begin{mldecls}
  1172   \begin{mldecls}
  1150   @{index_ML_type CodegenConsts.const: "string * typ list"} \\
  1173   @{index_ML_type CodegenConsts.const: "string * string option"} \\
  1151   @{index_ML CodegenConsts.norm_of_typ: "theory -> string * typ -> CodegenConsts.const"} \\
  1174   @{index_ML CodegenConsts.const_of_cexpr: "theory -> string * typ -> CodegenConsts.const"} \\
  1152   @{index_ML CodegenConsts.typ_of_inst: "theory -> CodegenConsts.const -> string * typ"} \\
       
  1153  \end{mldecls}
  1175  \end{mldecls}
  1154 
  1176 
  1155   \begin{description}
  1177   \begin{description}
  1156 
  1178 
  1157   \item @{ML_type CodegenConsts.const} is the identifier type:
  1179   \item @{ML_type CodegenConsts.const} is the identifier type:
  1158      the product of a \emph{string} with a list of \emph{typs}.
  1180      the product of a \emph{string} with a list of \emph{typs}.
  1159      The \emph{string} is the constant name as represented inside Isabelle;
  1181      The \emph{string} is the constant name as represented inside Isabelle;
  1160      the \emph{typs} are a type instantiation in the sense of System F,
  1182      for overloaded constants, the attached \emph{string option}
  1161      with canonical names for type variables.
  1183      is either @{text SOME} type constructor denoting an instance,
  1162 
  1184      or @{text NONE} for the polymorphic constant.
  1163   \item @{ML CodegenConsts.norm_of_typ}~@{text thy}~@{text "(constname, typ)"}
  1185 
  1164      maps a constant expression @{text "(constname, typ)"} to its canonical identifier.
  1186   \item @{ML CodegenConsts.const_of_cexpr}~@{text thy}~@{text "(constname, typ)"}
  1165 
  1187      maps a constant expression @{text "(constname, typ)"}
  1166   \item @{ML CodegenConsts.typ_of_inst}~@{text thy}~@{text const}
  1188      to its canonical identifier.
  1167      maps a canonical identifier @{text const} to a constant
       
  1168      expression with appropriate type.
       
  1169 
  1189 
  1170   \end{description}
  1190   \end{description}
  1171 *}
  1191 *}
  1172 
  1192 
  1173 subsection {* Executable theory content: codegen\_data.ML *}
  1193 subsection {* Executable theory content: codegen\_data.ML *}
  1194 
  1214 
  1195 subsubsection {* Managing executable content *}
  1215 subsubsection {* Managing executable content *}
  1196 
  1216 
  1197 text %mlref {*
  1217 text %mlref {*
  1198   \begin{mldecls}
  1218   \begin{mldecls}
  1199   @{index_ML CodegenData.add_func: "thm -> theory -> theory"} \\
  1219   @{index_ML CodegenData.add_func: "bool -> thm -> theory -> theory"} \\
  1200   @{index_ML CodegenData.del_func: "thm -> theory -> theory"} \\
  1220   @{index_ML CodegenData.del_func: "thm -> theory -> theory"} \\
  1201   @{index_ML CodegenData.add_funcl: "CodegenConsts.const * thm list Susp.T -> theory -> theory"} \\
  1221   @{index_ML CodegenData.add_funcl: "CodegenConsts.const * thm list Susp.T -> theory -> theory"} \\
  1202   @{index_ML CodegenData.add_inline: "thm -> theory -> theory"} \\
  1222   @{index_ML CodegenData.add_inline: "thm -> theory -> theory"} \\
  1203   @{index_ML CodegenData.del_inline: "thm -> theory -> theory"} \\
  1223   @{index_ML CodegenData.del_inline: "thm -> theory -> theory"} \\
  1204   @{index_ML CodegenData.add_inline_proc: "string * (theory -> cterm list -> thm list)
  1224   @{index_ML CodegenData.add_inline_proc: "string * (theory -> cterm list -> thm list)
  1270 
  1290 
  1271 text %mlref {*
  1291 text %mlref {*
  1272   \begin{mldecls}
  1292   \begin{mldecls}
  1273   @{index_ML CodegenConsts.const_ord: "CodegenConsts.const * CodegenConsts.const -> order"} \\
  1293   @{index_ML CodegenConsts.const_ord: "CodegenConsts.const * CodegenConsts.const -> order"} \\
  1274   @{index_ML CodegenConsts.eq_const: "CodegenConsts.const * CodegenConsts.const -> bool"} \\
  1294   @{index_ML CodegenConsts.eq_const: "CodegenConsts.const * CodegenConsts.const -> bool"} \\
  1275   @{index_ML CodegenConsts.consts_of: "theory -> term -> CodegenConsts.const list"} \\
       
  1276   @{index_ML CodegenConsts.read_const: "theory -> string -> CodegenConsts.const"} \\
  1295   @{index_ML CodegenConsts.read_const: "theory -> string -> CodegenConsts.const"} \\
  1277   @{index_ML_structure CodegenConsts.Consttab} \\
  1296   @{index_ML_structure CodegenConsts.Consttab} \\
  1278   @{index_ML CodegenFunc.typ_func: "thm -> typ"} \\
  1297   @{index_ML CodegenFunc.typ_func: "thm -> typ"} \\
  1279   @{index_ML CodegenFunc.rewrite_func: "thm list -> thm -> thm"} \\
  1298   @{index_ML CodegenFunc.rewrite_func: "thm list -> thm -> thm"} \\
  1280   \end{mldecls}
  1299   \end{mldecls}
  1284   \item @{ML CodegenConsts.const_ord},~@{ML CodegenConsts.eq_const}
  1303   \item @{ML CodegenConsts.const_ord},~@{ML CodegenConsts.eq_const}
  1285      provide order and equality on constant identifiers.
  1304      provide order and equality on constant identifiers.
  1286 
  1305 
  1287   \item @{ML_struct CodegenConsts.Consttab}
  1306   \item @{ML_struct CodegenConsts.Consttab}
  1288      provides table structures with constant identifiers as keys.
  1307      provides table structures with constant identifiers as keys.
  1289 
       
  1290   \item @{ML CodegenConsts.consts_of}~@{text thy}~@{text t}
       
  1291      returns all constant identifiers mentioned in a term @{text t}.
       
  1292 
  1308 
  1293   \item @{ML CodegenConsts.read_const}~@{text thy}~@{text s}
  1309   \item @{ML CodegenConsts.read_const}~@{text thy}~@{text s}
  1294      reads a constant as a concrete term expression @{text s}.
  1310      reads a constant as a concrete term expression @{text s}.
  1295 
  1311 
  1296   \item @{ML CodegenFunc.typ_func}~@{text thm}
  1312   \item @{ML CodegenFunc.typ_func}~@{text thm}