177 \subsection{Lexical matters *} |
177 \subsection{Lexical matters *} |
178 |
178 |
179 The parser does not process input strings directly, it rather operates on |
179 The parser does not process input strings directly, it rather operates on |
180 token lists provided by Isabelle's \rmindex{lexical analyzer} (the |
180 token lists provided by Isabelle's \rmindex{lexical analyzer} (the |
181 \bfindex{lexer}). There are two different kinds of tokens: {\bf |
181 \bfindex{lexer}). There are two different kinds of tokens: {\bf |
182 literals}\indexbold{literal token}\indexbold{token!literal} and {\bf valued |
182 delimiters}\indexbold{delimiter} and {\bf names}\indexbold{name}. |
183 tokens}\indexbold{valued token}\indexbold{token!valued}. |
183 |
184 |
184 Delimiters can be regarded as reserved words\index{reserved word} of the |
185 Literals can be regarded as reserved words\index{reserved word} of the syntax |
185 syntax and the user can add new ones, when extending theories. In |
186 and the user can add new ones, when extending theories. In |
186 Figure~\ref{fig:pure_gram} they appear in typewriter font, e.g.\ {\tt PROP}, |
187 Figure~\ref{fig:pure_gram} they appear in typewriter type, e.g.\ {\tt PROP}, |
|
188 {\tt ==}, {\tt =?=}, {\tt ;}. |
187 {\tt ==}, {\tt =?=}, {\tt ;}. |
189 |
188 |
190 Valued tokens on the other hand have a fixed predefined syntax. The lexer |
189 Names on the other hand have a fixed predefined syntax. The lexer |
191 distinguishes four kinds of them: identifiers\index{identifier}, |
190 distinguishes four kinds of them: identifiers\index{identifier}, |
192 unknowns\index{unknown}\index{scheme variable|see{unknown}}, type |
191 unknowns\index{unknown}\index{scheme variable|see{unknown}}, type |
193 variables\index{type variable}, type unknowns\index{type unknown}\index{type |
192 variables\index{type variable}, type unknowns\index{type unknown}\index{type |
194 scheme variable|see{type unknown}}; they are denoted by $id$\index{id@$id$}, |
193 scheme variable|see{type unknown}}; they are denoted by $id$\index{id@$id$}, |
195 $var$\index{var@$var$}, $tfree$\index{tfree@$tfree$}, |
194 $var$\index{var@$var$}, $tfree$\index{tfree@$tfree$}, |
196 $tvar$\index{tvar@$tvar$}, respectively. Typical examples are {\tt x}, {\tt |
195 $tvar$\index{tvar@$tvar$}, respectively. Typical examples are {\tt x}, {\tt |
197 ?x7}, {\tt 'a}, {\tt ?'a3}, the exact syntax is: |
196 ?x7}, {\tt 'a}, {\tt ?'a3}, the exact syntax is: |
198 |
197 \begin{eqnarray*} |
199 \begin{tabular}{rcl} |
198 id & = & letter~quasiletter^* \\ |
200 $id$ & = & $letter~quasiletter^*$ \\ |
199 var & = & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\ |
201 $var$ & = & ${\tt ?}id ~~|~~ {\tt ?}id{\tt .}nat$ \\ |
200 tfree & = & \mbox{\tt '}id \\ |
202 $tfree$ & = & ${\tt '}id$ \\ |
201 tvar & = & \mbox{\tt ?}tfree ~~|~~ |
203 $tvar$ & = & ${\tt ?}tfree ~~|~~ {\tt ?}tfree{\tt .}nat$ \\[1ex] |
202 \mbox{\tt ?}tfree\mbox{\tt .}nat \\[1ex] |
204 |
203 letter & = & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\ |
205 $letter$ & = & one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z} \\ |
204 digit & = & \mbox{one of {\tt 0}\dots {\tt 9}} \\ |
206 $digit$ & = & one of {\tt 0}\dots {\tt 9} \\ |
205 quasiletter & = & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\ |
207 $quasiletter$ & = & $letter ~~|~~ digit ~~|~~ {\tt _} ~~|~~ {\tt '}$ \\ |
206 nat & = & digit^+ |
208 $nat$ & = & $digit^+$ |
207 \end{eqnarray*} |
209 \end{tabular} |
208 A $var$ or $tvar$ describes an \rmindex{unknown} which is internally a pair |
210 |
209 of base name and index (\ML\ type \ttindex{indexname}). These components are |
211 A string of $var$ or $tvar$ describes an \rmindex{unknown} which is |
210 either explicitly separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or |
212 internally a pair of base name and index (\ML\ type \ttindex{indexname}). |
211 directly run together as in {\tt ?x1}. The latter form is possible, if the |
213 These components are either explicitly separated by a dot as in {\tt ?x.1} or |
212 base name does not end with digits; if the index is 0, it may be dropped |
214 {\tt ?x7.3} or directly run together as in {\tt ?x1}. The latter form is |
213 altogether: {\tt ?x} abbreviates {\tt ?x0} or {\tt ?x.0}. |
215 possible, if the base name does not end with digits. And if the index is 0, |
|
216 it may be dropped altogether: {\tt ?x} abbreviates {\tt ?x0} or {\tt ?x.0}. |
|
217 |
214 |
218 Note that $id$, $var$, $tfree$, $tvar$ are mutually disjoint, but it is |
215 Note that $id$, $var$, $tfree$, $tvar$ are mutually disjoint, but it is |
219 perfectly legal that they overlap with the set of literal tokens (e.g.\ {\tt |
216 perfectly legal that they overlap with the set of delimiters (e.g.\ {\tt |
220 PROP}, {\tt ALL}, {\tt EX}). |
217 PROP}, {\tt ALL}, {\tt EX}). |
221 |
218 |
222 The lexical analyzer translates input strings to token lists by repeatedly |
219 The lexical analyzer translates input strings to token lists by repeatedly |
223 taking the maximal prefix of the input string that forms a valid token. A |
220 taking the maximal prefix of the input string that forms a valid token. A |
224 maximal prefix that is both a literal and a valued token is treated as a |
221 maximal prefix that is both a delimiter and a name is treated as a delimiter. |
225 literal. Spaces, tabs and newlines are separators; they never occur within |
222 Spaces, tabs and newlines are separators; they never occur within tokens. |
226 tokens. |
223 |
227 |
224 Note that delimiters need not necessarily be surrounded by white space to be |
228 Note that literals need not necessarily be surrounded by white space to be |
225 recognized as separate. For example, if {\tt -} is a delimiter but {\tt --} |
229 recognized as separate. For example, if {\tt -} is a literal but {\tt --} is |
226 is not, then the string {\tt --} is treated as two consecutive occurrences of |
230 not, then the string {\tt --} is treated as two consecutive occurrences of |
227 {\tt -}. This is in contrast to \ML\ which always treats {\tt --} as a single |
231 {\tt -}. This is in contrast to \ML\ which would treat {\tt --} always as a |
228 symbolic name. The consequence of Isabelle's more liberal scheme is that the |
232 single symbolic name. The consequence of Isabelle's more liberal scheme is |
229 same string may be parsed in different ways after extending the syntax: after |
233 that the same string may be parsed in different ways after extending the |
230 adding {\tt --} as a delimiter, the input {\tt --} is treated as a single |
234 syntax: after adding {\tt --} as a literal, the input {\tt --} is treated as |
231 token. |
235 a single token. |
|
236 |
232 |
237 |
233 |
238 \subsection{Inspecting syntax *} |
234 \subsection{Inspecting syntax *} |
239 |
235 |
240 You may get the \ML\ representation of the syntax of any Isabelle theory by |
236 You may get the \ML\ representation of the syntax of any Isabelle theory by |
249 Syntax.print_syntax: Syntax.syntax -> unit |
245 Syntax.print_syntax: Syntax.syntax -> unit |
250 Syntax.print_gram: Syntax.syntax -> unit |
246 Syntax.print_gram: Syntax.syntax -> unit |
251 Syntax.print_trans: Syntax.syntax -> unit |
247 Syntax.print_trans: Syntax.syntax -> unit |
252 \end{ttbox} |
248 \end{ttbox} |
253 {\tt Syntax.print_syntax} shows virtually all information contained in a |
249 {\tt Syntax.print_syntax} shows virtually all information contained in a |
254 syntax, therefore being quite verbose. Its output is divided into labeled |
250 syntax. Its output is divided into labeled sections. The syntax proper is |
255 sections. The syntax proper is represented by {\tt lexicon}, {\tt roots} and |
251 represented by {\tt lexicon}, {\tt roots} and {\tt prods}. The rest refers to |
256 {\tt prods}. The rest refers to the manifold facilities to apply syntactic |
252 the manifold facilities to apply syntactic translations (macro expansion |
257 translations (macro expansion etc.). |
253 etc.). |
258 |
254 |
259 To simplify coping with the verbosity of {\tt Syntax.print_syntax}, there are |
255 To cope with the verbosity of {\tt Syntax.print_syntax}, there are |
260 \ttindex{Syntax.print_gram} to print the syntax proper only and |
256 \ttindex{Syntax.print_gram} and \ttindex{Syntax.print_trans} to print the |
261 \ttindex{Syntax.print_trans} to print the translation related information |
257 syntax proper and the translation related information only. |
262 only. |
|
263 |
258 |
264 Let's have a closer look at part of Pure's syntax: |
259 Let's have a closer look at part of Pure's syntax: |
265 \begin{ttbox} |
260 \begin{ttbox} |
266 Syntax.print_syntax (syn_of Pure.thy); |
261 Syntax.print_syntax (syn_of Pure.thy); |
267 {\out lexicon: "!!" "%" "(" ")" "," "." "::" ";" "==" "==>" \dots} |
262 {\out lexicon: "!!" "%" "(" ")" "," "." "::" ";" "==" "==>" \dots} |
306 constructor and $\alpha$ a type variable or unknown. Note that only the |
299 constructor and $\alpha$ a type variable or unknown. Note that only the |
307 outermost type constructor is taken into account. |
300 outermost type constructor is taken into account. |
308 |
301 |
309 \item[\ttindex{prods}] |
302 \item[\ttindex{prods}] |
310 The list of productions describing the precedence grammar. Nonterminals |
303 The list of productions describing the precedence grammar. Nonterminals |
311 $A@n$ are rendered in {\sc ascii} as {\tt $A$[$n$]}, literal tokens are |
304 $A@n$ are rendered in {\sc ascii} as {\tt $A$[$n$]}, delimiters are |
312 quoted. Some productions have strings attached after an {\tt =>}. These |
305 quoted. Some productions have strings attached after an {\tt =>}. These |
313 strings later become the heads of parse trees, but they also play a vital |
306 strings later become the heads of parse trees, but they also play a vital |
314 role when terms are printed (see \S\ref{sec:asts}). |
307 role when terms are printed (see \S\ref{sec:asts}). |
315 |
308 |
316 Productions which do not have string attached and thus do not create a |
309 Productions which do not have string attached and thus do not create a |
317 new parse tree node are called {\bf copy productions}\indexbold{copy |
310 new parse tree node are called {\bf copy productions}\indexbold{copy |
318 production}. They must have exactly one |
311 production}. They must have exactly one |
319 argument\index{argument!production} (i.e.\ nonterminal or valued token) |
312 argument\index{argument!production} (i.e.\ nonterminal or name token) |
320 on the right-hand side. The parse tree generated when parsing that |
313 on the right-hand side. The parse tree generated when parsing that |
321 argument is simply passed up as the result of parsing the whole copy |
314 argument is simply passed up as the result of parsing the whole copy |
322 production. |
315 production. |
323 |
316 |
324 A special kind of copy production is one where the argument is a |
317 A special kind of copy production is one where the argument is a |
325 nonterminal and no additional syntactic sugar (literals) exists. It is |
318 nonterminal and no additional syntactic sugar (delimiters) exists. It is |
326 called a \bfindex{chain production}. Chain productions should be seen as |
319 called a \bfindex{chain production}. Chain productions should be seen as |
327 an abbreviation mechanism: conceptually, they are removed from the |
320 an abbreviation mechanism: conceptually, they are removed from the |
328 grammar by adding appropriate new rules. Precedence information attached |
321 grammar by adding appropriate new rules. Precedence information attached |
329 to chain productions is ignored, only the dummy value $-1$ is displayed. |
322 to chain productions is ignored, only the dummy value $-1$ is displayed. |
330 |
323 |
336 \index{*print_translation} \index{*print_ast_translation} |
329 \index{*print_translation} \index{*print_ast_translation} |
337 The sets of constants that invoke translation functions. These are more |
330 The sets of constants that invoke translation functions. These are more |
338 arcane matters (see \S\ref{sec:asts} and \S\ref{sec:tr_funs}). |
331 arcane matters (see \S\ref{sec:asts} and \S\ref{sec:tr_funs}). |
339 \end{description} |
332 \end{description} |
340 |
333 |
341 Of course you may inspect the syntax of any theory using the above calling |
334 You may inspect the syntax of any theory using the above calling |
342 sequence. Beware that, as more and more material accumulates, the output |
335 sequence. Beware that, as more and more material accumulates, the output |
343 becomes even more verbose. When extending syntaxes, new {\tt roots}, {\tt |
336 becomes longer and longer. |
344 prods}, {\tt parse_rules} and {\tt print_rules} are appended to the end. The |
337 % When extending syntaxes, new {\tt roots}, {\tt prods}, {\tt parse_rules} |
345 other lists are displayed sorted. |
338 % and {\tt print_rules} are appended to the end. The other lists are |
346 |
339 % displayed sorted. |
347 |
340 |
348 |
341 |
349 \section{Abstract syntax trees} \label{sec:asts} |
342 \section{Abstract syntax trees} \label{sec:asts} |
350 |
343 |
351 Figure~\ref{fig:parse_print} shows a simplified model of the parsing and |
344 Figure~\ref{fig:parse_print} shows a simplified model of the parsing and |
385 The printing process is the reverse, except for some subtleties to be |
378 The printing process is the reverse, except for some subtleties to be |
386 discussed later. |
379 discussed later. |
387 |
380 |
388 Asts are an intermediate form between the very raw parse trees and the typed |
381 Asts are an intermediate form between the very raw parse trees and the typed |
389 $\lambda$-terms. An ast is either an atom (constant or variable) or a list of |
382 $\lambda$-terms. An ast is either an atom (constant or variable) or a list of |
390 {\em at least two\/} subasts. Internally, they have type \ttindex{Syntax.ast} |
383 {\em at least two\/} subasts. Internally, they have type |
391 which is defined as: \index{*Constant} \index{*Variable} \index{*Appl} |
384 \ttindex{Syntax.ast}: \index{*Constant} \index{*Variable} \index{*Appl} |
392 \begin{ttbox} |
385 \begin{ttbox} |
393 datatype ast = |
386 datatype ast = Constant of string |
394 Constant of string | |
387 | Variable of string |
395 Variable of string | |
388 | Appl of ast list |
396 Appl of ast list |
389 \end{ttbox} |
397 \end{ttbox} |
390 |
398 |
391 We write constant atoms as quoted strings, variable atoms as non-quoted |
399 Notation: We write constant atoms as quoted strings, variable atoms as |
392 strings and applications as list of subasts separated by space and enclosed |
400 non-quoted strings and applications as list of subasts separated by space and |
393 in parentheses. For example: |
401 enclosed in parentheses. For example: |
|
402 \begin{ttbox} |
394 \begin{ttbox} |
403 Appl [Constant "_constrain", |
395 Appl [Constant "_constrain", |
404 Appl [Constant "_abs", Variable "x", Variable "t"], |
396 Appl [Constant "_abs", Variable "x", Variable "t"], |
405 Appl [Constant "fun", Variable "'a", Variable "'b"]] |
397 Appl [Constant "fun", Variable "'a", Variable "'b"]] |
406 {\rm is written as} ("_constrain" ("_abs" x t) ("fun" 'a 'b)) |
398 \end{ttbox} |
407 \end{ttbox} |
399 is written as {\tt ("_constrain" ("_abs" x t) ("fun" 'a 'b))}. |
408 |
|
409 Note that {\tt ()} and {\tt (f)} are both illegal. |
400 Note that {\tt ()} and {\tt (f)} are both illegal. |
410 |
401 |
411 The resemblance of Lisp's S-expressions is intentional, but notice the two |
402 The resemblance of Lisp's S-expressions is intentional, but notice the two |
412 kinds of atomic symbols: $\Constant x$ and $\Variable x$. This distinction |
403 kinds of atomic symbols: $\Constant x$ and $\Variable x$. This distinction |
413 has some more obscure reasons and you can ignore it about half of the time. |
404 has some more obscure reasons and you can ignore it about half of the time. |
445 |
436 |
446 Ignoring parse ast translations, the mapping |
437 Ignoring parse ast translations, the mapping |
447 $ast_of_pt$\index{ast_of_pt@$ast_of_pt$} from parse trees to asts is derived |
438 $ast_of_pt$\index{ast_of_pt@$ast_of_pt$} from parse trees to asts is derived |
448 from the productions involved as follows: |
439 from the productions involved as follows: |
449 \begin{itemize} |
440 \begin{itemize} |
450 \item Valued tokens: $ast_of_pt(t) = \Variable s$, where $t$ is an $id$, |
441 \item Name tokens: $ast_of_pt(t) = \Variable s$, where $t$ is an $id$, |
451 $var$, $tfree$ or $tvar$ token, and $s$ its value. |
442 $var$, $tfree$ or $tvar$ token, and $s$ its associated string. |
452 |
443 |
453 \item Copy productions: $ast_of_pt(\ldots P \ldots) = ast_of_pt(P)$. |
444 \item Copy productions: $ast_of_pt(\ldots P \ldots) = ast_of_pt(P)$. |
454 |
445 |
455 \item $0$-ary productions: $ast_of_pt(\ldots \mtt{=>} c) = \Constant c$. |
446 \item $0$-ary productions: $ast_of_pt(\ldots \mtt{=>} c) = \Constant c$. |
456 |
447 |
457 \item $n$-ary productions: $ast_of_pt(\ldots P@1 \ldots P@n \ldots \mtt{=>} |
448 \item $n$-ary productions: $ast_of_pt(\ldots P@1 \ldots P@n \ldots \mtt{=>} |
458 c) = \Appl{\Constant c,$ $ast_of_pt(P@1),$ $\ldots,$ $ast_of_pt(P@n)}$, |
449 c) = \Appl{\Constant c,$ $ast_of_pt(P@1),$ $\ldots,$ $ast_of_pt(P@n)}$, |
459 where $n \ge 1$. |
450 where $n \ge 1$. |
460 \end{itemize} |
451 \end{itemize} |
461 Thereby $P, P@1, \ldots, P@n$ denote sub parse trees or valued tokens and |
452 where $P, P@1, \ldots, P@n$ denote parse trees or name tokens and ``\dots'' |
462 ``\dots'' zero or more literal tokens. That means literal tokens are stripped |
453 zero or more delimiters. Delimiters are stripped and do not appear in asts. |
463 and do not appear in asts. |
|
464 |
|
465 The following table presents some simple examples: |
454 The following table presents some simple examples: |
466 |
455 \begin{center} |
467 {\tt\begin{tabular}{ll} |
456 {\tt\begin{tabular}{ll} |
468 \rm input string & \rm ast \\\hline |
457 \rm input string & \rm ast \\\hline |
469 "f" & f \\ |
458 "f" & f \\ |
470 "'a" & 'a \\ |
459 "'a" & 'a \\ |
471 "t == u" & ("==" t u) \\ |
460 "t == u" & ("==" t u) \\ |
472 "f(x)" & ("_appl" f x) \\ |
461 "f(x)" & ("_appl" f x) \\ |
473 "f(x, y)" & ("_appl" f ("_args" x y)) \\ |
462 "f(x, y)" & ("_appl" f ("_args" x y)) \\ |
474 "f(x, y, z)" & ("_appl" f ("_args" x ("_args" y z))) \\ |
463 "f(x, y, z)" & ("_appl" f ("_args" x ("_args" y z))) \\ |
475 "\%x y.\ t" & ("_lambda" ("_idts" x y) t) \\ |
464 "\%x y.\ t" & ("_lambda" ("_idts" x y) t) \\ |
476 \end{tabular}} |
465 \end{tabular}} |
477 |
466 \end{center} |
478 Some of these examples illustrate why further translations are desirable in |
467 Some of these examples illustrate why further translations are desirable in |
479 order to provide some nice standard form of asts before macro expansion takes |
468 order to provide some nice standard form of asts before macro expansion takes |
480 place. Hence the Pure syntax provides predefined parse ast |
469 place. Hence the Pure syntax provides predefined parse ast |
481 translations\index{parse ast translation!of Pure} for ordinary applications, |
470 translations\index{parse ast translation!of Pure} for ordinary applications, |
482 type applications, nested abstractions, meta implications and function types. |
471 type applications, nested abstractions, meta implications and function types. |
642 \item $sy$ is the right-hand side of the production, specified as a |
631 \item $sy$ is the right-hand side of the production, specified as a |
643 symbolic string. In general, $sy$ is of the form $\alpha@0 \_ \alpha@1 |
632 symbolic string. In general, $sy$ is of the form $\alpha@0 \_ \alpha@1 |
644 \dots \alpha@{n-1} \_ \alpha@n$, where each occurrence of \ttindex{_} |
633 \dots \alpha@{n-1} \_ \alpha@n$, where each occurrence of \ttindex{_} |
645 denotes an argument\index{argument!mixfix} position and the strings |
634 denotes an argument\index{argument!mixfix} position and the strings |
646 $\alpha@i$ do not contain (non-escaped) {\tt _}. The $\alpha@i$ may |
635 $\alpha@i$ do not contain (non-escaped) {\tt _}. The $\alpha@i$ may |
647 consist of delimiters\index{delimiter}, |
636 consist of delimiters\index{delimiter}, spaces\index{space (pretty |
648 spaces\index{space (pretty printing)} or \rmindex{pretty printing} |
637 printing)} or \rmindex{pretty printing} annotations (see below). |
649 annotations (see below). |
|
650 |
638 |
651 \item $\tau$ specifies the syntactic categories of the arguments on the |
639 \item $\tau$ specifies the syntactic categories of the arguments on the |
652 left-hand and of the right-hand side. Arguments may be nonterminals or |
640 left-hand and of the right-hand side. Arguments may be nonterminals or |
653 valued tokens. If $sy$ is of the form above, $\tau$ must be a nested |
641 name tokens. If $sy$ is of the form above, $\tau$ must be a nested |
654 function type with at least $n$ argument positions, say $\tau = [\tau@1, |
642 function type with at least $n$ argument positions, say $\tau = [\tau@1, |
655 \dots, \tau@n] \To \tau'$. The syntactic category of argument $i$ is |
643 \dots, \tau@n] \To \tau'$. The syntactic category of argument $i$ is |
656 derived from type $\tau@i$ (see $root_of_type$ in |
644 derived from type $\tau@i$ (see $root_of_type$ in |
657 \S\ref{sec:basic_syntax}). The result, i.e.\ the left-hand side of the |
645 \S\ref{sec:basic_syntax}). The result, i.e.\ the left-hand side of the |
658 production, is derived from type $\tau'$. Note that the $\tau@i$ and |
646 production, is derived from type $\tau'$. Note that the $\tau@i$ and |
902 |
890 |
903 |
891 |
904 \subsection{Specifying macros} |
892 \subsection{Specifying macros} |
905 |
893 |
906 Basically macros are rewrite rules on asts. But unlike other macro systems of |
894 Basically macros are rewrite rules on asts. But unlike other macro systems of |
907 various programming languages, Isabelle's macros work two way. Therefore a |
895 various programming languages, Isabelle's macros work in both directions. |
908 syntax contains two lists of rules: one for parsing and one for printing. |
896 Therefore a syntax contains two lists of rules: one for parsing and one for |
|
897 printing. |
909 |
898 |
910 The {\tt translations} section\index{translations section@{\tt translations} |
899 The {\tt translations} section\index{translations section@{\tt translations} |
911 section} consists of a list of rule specifications of the form: |
900 section} consists of a list of rule specifications of the form: |
912 |
901 |
913 \begin{center} |
902 \begin{center} |
914 {\tt $[$ ($root$) $]$ $string$ $[$ => $|$ <= $|$ == $]$ $[$ ($root$) $]$ |
903 {\tt $[$ ($root$) $]$ $string$ $[$ => $|$ <= $|$ == $]$ $[$ ($root$) $]$ |
915 $string$}. |
904 $string$}. |
916 \end{center} |
905 \end{center} |
917 |
906 |
918 This specifies a \rmindex{parse rule} ({\tt =>}) a \rmindex{print rule} ({\tt |
907 This specifies a \rmindex{parse rule} ({\tt =>}) a \rmindex{print rule} ({\tt |
919 <=}) or both ({\tt ==}). The two $string$s preceded by optional parenthesized |
908 <=}) or both ({\tt ==}). The two $string$s preceded by optional |
920 $root$s denote the left-hand and right-hand side of the rule as `source |
909 parenthesized $root$s are the left-hand and right-hand side of the rule in |
921 code', i.e.\ in the usual syntax of terms. |
910 user-defined syntax. |
922 |
911 |
923 Rules are internalized wrt.\ an intermediate signature that is obtained from |
912 Rules are internalized wrt.\ an intermediate signature that is obtained from |
924 the parent theories' ones by adding all material of all sections preceding |
913 the parent theories' ones by adding all material of all sections preceding |
925 {\tt translations} in the {\tt .thy} file. Especially, new syntax defined in |
914 {\tt translations} in the {\tt .thy} file. Especially, new syntax defined in |
926 {\tt consts} is already effective. |
915 {\tt consts} is already effective. |
1342 \begin{example} \label{ex:tr_funs} |
1331 \begin{example} \label{ex:tr_funs} |
1343 |
1332 |
1344 We continue Example~\ref{ex:prod_trans} by presenting the \ML\ sources of the |
1333 We continue Example~\ref{ex:prod_trans} by presenting the \ML\ sources of the |
1345 parse translation for \ttindex{_K} and the print translation |
1334 parse translation for \ttindex{_K} and the print translation |
1346 \ttindex{dependent_tr'}: |
1335 \ttindex{dependent_tr'}: |
1347 |
|
1348 \begin{ttbox} |
1336 \begin{ttbox} |
1349 (* nondependent abstraction *) |
1337 (* nondependent abstraction *) |
1350 |
|
1351 fun k_tr (*"_K"*) [t] = Abs ("x", dummyT, incr_boundvars 1 t) |
1338 fun k_tr (*"_K"*) [t] = Abs ("x", dummyT, incr_boundvars 1 t) |
1352 | k_tr (*"_K"*) ts = raise_term "k_tr" ts; |
1339 | k_tr (*"_K"*) ts = raise TERM("k_tr",ts); |
1353 |
1340 |
1354 (* dependent / nondependent quantifiers *) |
1341 (* dependent / nondependent quantifiers *) |
1355 |
|
1356 fun dependent_tr' (q, r) (A :: Abs (x, T, B) :: ts) = |
1342 fun dependent_tr' (q, r) (A :: Abs (x, T, B) :: ts) = |
1357 if 0 mem (loose_bnos B) then |
1343 if 0 mem (loose_bnos B) then |
1358 let val (x', B') = variant_abs (x, dummyT, B); |
1344 let val (x', B') = variant_abs (x, dummyT, B); |
1359 in list_comb (Const (q, dummyT) $ Free (x', T) $ A $ B', ts) |
1345 in list_comb (Const (q, dummyT) $ Free (x', T) $ A $ B', ts) |
1360 end |
1346 end |
1361 else list_comb (Const (r, dummyT) $ A $ B, ts) |
1347 else list_comb (Const (r, dummyT) $ A $ B, ts) |
1362 | dependent_tr' _ _ = raise Match; |
1348 | dependent_tr' _ _ = raise Match; |
1363 \end{ttbox} |
1349 \end{ttbox} |
1364 |
1350 This text is taken from the Pure sources. Ordinary user translations |
1365 This text is taken from the Pure sources, ordinary user translations would |
1351 would typically appear within the {\tt ML} section of the {\tt .thy} file. To |
1366 typically appear within the {\tt ML} section of the {\tt .thy} file. |
1352 link {\tt k_tr} into the parser requires the definition |
1367 |
1353 \begin{ttbox} |
1368 \medskip |
1354 val parse_translation = [("_K",k_tr)]; |
|
1355 \end{ttbox} |
|
1356 For {\tt k_tr} this is not necessary, as it is built-in, but user-defined |
|
1357 translation functions are added in exactly the same way. \medskip |
|
1358 |
1369 If {\tt k_tr} is called with exactly one argument $t$, it creates a new {\tt |
1359 If {\tt k_tr} is called with exactly one argument $t$, it creates a new {\tt |
1370 Abs} node with a body derived from $t$: loose {\tt Bound}s, i.e.\ those |
1360 Abs} node with a body derived from $t$: loose {\tt Bound}s, i.e.\ those |
1371 referring to outer {\tt Abs}s, are incremented using |
1361 referring to outer {\tt Abs}s, are incremented using |
1372 \ttindex{incr_boundvars}. This and many other basic term manipulation |
1362 \ttindex{incr_boundvars}. This and many other basic term manipulation |
1373 functions are defined in {\tt Pure/term.ML}, the comments there being in most |
1363 functions are defined in {\tt Pure/term.ML}, the comments there being in most |