323
|
1 |
%% $Id$
|
|
2 |
\chapter{Syntax Transformations} \label{chap:syntax}
|
|
3 |
\newcommand\ttapp{\mathrel{\hbox{\tt\$}}}
|
|
4 |
\newcommand\mtt[1]{\mbox{\tt #1}}
|
|
5 |
\newcommand\ttfct[1]{\mathop{\mtt{#1}}\nolimits}
|
|
6 |
\newcommand\Constant{\ttfct{Constant}}
|
|
7 |
\newcommand\Variable{\ttfct{Variable}}
|
|
8 |
\newcommand\Appl[1]{\ttfct{Appl}\,[#1]}
|
|
9 |
\index{syntax!transformations|(}
|
|
10 |
|
|
11 |
This chapter is intended for experienced Isabelle users who need to define
|
|
12 |
macros or code their own translation functions. It describes the
|
|
13 |
transformations between parse trees, abstract syntax trees and terms.
|
|
14 |
|
|
15 |
|
|
16 |
\section{Abstract syntax trees} \label{sec:asts}
|
|
17 |
\index{ASTs|(}
|
|
18 |
|
|
19 |
The parser, given a token list from the lexer, applies productions to yield
|
|
20 |
a parse tree\index{parse trees}. By applying some internal transformations
|
|
21 |
the parse tree becomes an abstract syntax tree, or \AST{}. Macro
|
|
22 |
expansion, further translations and finally type inference yields a
|
|
23 |
well-typed term. The printing process is the reverse, except for some
|
|
24 |
subtleties to be discussed later.
|
|
25 |
|
|
26 |
Figure~\ref{fig:parse_print} outlines the parsing and printing process.
|
|
27 |
Much of the complexity is due to the macro mechanism. Using macros, you
|
|
28 |
can specify most forms of concrete syntax without writing any \ML{} code.
|
|
29 |
|
|
30 |
\begin{figure}
|
|
31 |
\begin{center}
|
|
32 |
\begin{tabular}{cl}
|
|
33 |
string & \\
|
|
34 |
$\downarrow$ & parser \\
|
|
35 |
parse tree & \\
|
|
36 |
$\downarrow$ & parse \AST{} translation \\
|
|
37 |
\AST{} & \\
|
|
38 |
$\downarrow$ & \AST{} rewriting (macros) \\
|
|
39 |
\AST{} & \\
|
|
40 |
$\downarrow$ & parse translation, type inference \\
|
|
41 |
--- well-typed term --- & \\
|
|
42 |
$\downarrow$ & print translation \\
|
|
43 |
\AST{} & \\
|
|
44 |
$\downarrow$ & \AST{} rewriting (macros) \\
|
|
45 |
\AST{} & \\
|
|
46 |
$\downarrow$ & print \AST{} translation, printer \\
|
|
47 |
string &
|
|
48 |
\end{tabular}
|
|
49 |
|
|
50 |
\end{center}
|
|
51 |
\caption{Parsing and printing}\label{fig:parse_print}
|
|
52 |
\end{figure}
|
|
53 |
|
|
54 |
Abstract syntax trees are an intermediate form between the raw parse trees
|
|
55 |
and the typed $\lambda$-terms. An \AST{} is either an atom (constant or
|
|
56 |
variable) or a list of {\em at least two\/} subtrees. Internally, they
|
|
57 |
have type \mltydx{Syntax.ast}: \index{*Constant} \index{*Variable}
|
|
58 |
\index{*Appl}
|
|
59 |
\begin{ttbox}
|
|
60 |
datatype ast = Constant of string
|
|
61 |
| Variable of string
|
|
62 |
| Appl of ast list
|
|
63 |
\end{ttbox}
|
|
64 |
%
|
|
65 |
Isabelle uses an S-expression syntax for abstract syntax trees. Constant
|
|
66 |
atoms are shown as quoted strings, variable atoms as non-quoted strings and
|
|
67 |
applications as a parenthesized list of subtrees. For example, the \AST
|
|
68 |
\begin{ttbox}
|
|
69 |
Appl [Constant "_constrain",
|
|
70 |
Appl [Constant "_abs", Variable "x", Variable "t"],
|
|
71 |
Appl [Constant "fun", Variable "'a", Variable "'b"]]
|
|
72 |
\end{ttbox}
|
|
73 |
is shown as {\tt ("_constrain" ("_abs" x t) ("fun" 'a 'b))}.
|
|
74 |
Both {\tt ()} and {\tt (f)} are illegal because they have too few
|
|
75 |
subtrees.
|
|
76 |
|
|
77 |
The resemblance to Lisp's S-expressions is intentional, but there are two
|
|
78 |
kinds of atomic symbols: $\Constant x$ and $\Variable x$. Do not take the
|
|
79 |
names {\tt Constant} and {\tt Variable} too literally; in the later
|
|
80 |
translation to terms, $\Variable x$ may become a constant, free or bound
|
|
81 |
variable, even a type constructor or class name; the actual outcome depends
|
|
82 |
on the context.
|
|
83 |
|
|
84 |
Similarly, you can think of ${\tt (} f~x@1~\ldots~x@n{\tt )}$ as the
|
|
85 |
application of~$f$ to the arguments $x@1, \ldots, x@n$. But the kind of
|
|
86 |
application is determined later by context; it could be a type constructor
|
|
87 |
applied to types.
|
|
88 |
|
|
89 |
Forms like {\tt (("_abs" x $t$) $u$)} are legal, but \AST{}s are
|
|
90 |
first-order: the {\tt "_abs"} does not bind the {\tt x} in any way. Later
|
|
91 |
at the term level, {\tt ("_abs" x $t$)} will become an {\tt Abs} node and
|
|
92 |
occurrences of {\tt x} in $t$ will be replaced by bound variables (the term
|
|
93 |
constructor \ttindex{Bound}).
|
|
94 |
|
|
95 |
|
|
96 |
\section{Transforming parse trees to \AST{}s}\label{sec:astofpt}
|
|
97 |
\index{ASTs!made from parse trees}
|
|
98 |
\newcommand\astofpt[1]{\lbrakk#1\rbrakk}
|
|
99 |
|
|
100 |
The parse tree is the raw output of the parser. Translation functions,
|
|
101 |
called {\bf parse AST translations}\indexbold{translations!parse AST},
|
|
102 |
transform the parse tree into an abstract syntax tree.
|
|
103 |
|
|
104 |
The parse tree is constructed by nesting the right-hand sides of the
|
|
105 |
productions used to recognize the input. Such parse trees are simply lists
|
|
106 |
of tokens and constituent parse trees, the latter representing the
|
|
107 |
nonterminals of the productions. Let us refer to the actual productions in
|
|
108 |
the form displayed by {\tt Syntax.print_syntax}.
|
|
109 |
|
|
110 |
Ignoring parse \AST{} translations, parse trees are transformed to \AST{}s
|
|
111 |
by stripping out delimiters and copy productions. More precisely, the
|
|
112 |
mapping $\astofpt{-}$ is derived from the productions as follows:
|
|
113 |
\begin{itemize}
|
|
114 |
\item Name tokens: $\astofpt{t} = \Variable s$, where $t$ is an \ndx{id},
|
|
115 |
\ndx{var}, \ndx{tid} or \ndx{tvar} token, and $s$ its associated string.
|
|
116 |
|
|
117 |
\item Copy productions:\index{productions!copy}
|
|
118 |
$\astofpt{\ldots P \ldots} = \astofpt{P}$. Here $\ldots$ stands for
|
|
119 |
strings of delimiters, which are discarded. $P$ stands for the single
|
|
120 |
constituent that is not a delimiter; it is either a nonterminal symbol or
|
|
121 |
a name token.
|
|
122 |
|
|
123 |
\item 0-ary productions: $\astofpt{\ldots \mtt{=>} c} = \Constant c$.
|
|
124 |
Here there are no constituents other than delimiters, which are
|
|
125 |
discarded.
|
|
126 |
|
|
127 |
\item $n$-ary productions, where $n \ge 1$: delimiters are discarded and
|
|
128 |
the remaining constituents $P@1$, \ldots, $P@n$ are built into an
|
|
129 |
application whose head constant is~$c$:
|
|
130 |
\[ \astofpt{\ldots P@1 \ldots P@n \ldots \mtt{=>} c} =
|
|
131 |
\Appl{\Constant c, \astofpt{P@1}, \ldots, \astofpt{P@n}}
|
|
132 |
\]
|
|
133 |
\end{itemize}
|
|
134 |
Figure~\ref{fig:parse_ast} presents some simple examples, where {\tt ==},
|
|
135 |
{\tt _appl}, {\tt _args}, and so forth name productions of the Pure syntax.
|
|
136 |
These examples illustrate the need for further translations to make \AST{}s
|
|
137 |
closer to the typed $\lambda$-calculus. The Pure syntax provides
|
|
138 |
predefined parse \AST{} translations\index{translations!parse AST} for
|
|
139 |
ordinary applications, type applications, nested abstractions, meta
|
|
140 |
implications and function types. Figure~\ref{fig:parse_ast_tr} shows their
|
|
141 |
effect on some representative input strings.
|
|
142 |
|
|
143 |
|
|
144 |
\begin{figure}
|
|
145 |
\begin{center}
|
|
146 |
\tt\begin{tabular}{ll}
|
|
147 |
\rm input string & \rm \AST \\\hline
|
|
148 |
"f" & f \\
|
|
149 |
"'a" & 'a \\
|
|
150 |
"t == u" & ("==" t u) \\
|
|
151 |
"f(x)" & ("_appl" f x) \\
|
|
152 |
"f(x, y)" & ("_appl" f ("_args" x y)) \\
|
|
153 |
"f(x, y, z)" & ("_appl" f ("_args" x ("_args" y z))) \\
|
|
154 |
"\%x y.\ t" & ("_lambda" ("_idts" x y) t) \\
|
|
155 |
\end{tabular}
|
|
156 |
\end{center}
|
|
157 |
\caption{Parsing examples using the Pure syntax}\label{fig:parse_ast}
|
|
158 |
\end{figure}
|
|
159 |
|
|
160 |
\begin{figure}
|
|
161 |
\begin{center}
|
|
162 |
\tt\begin{tabular}{ll}
|
|
163 |
\rm input string & \rm \AST{} \\\hline
|
|
164 |
"f(x, y, z)" & (f x y z) \\
|
|
165 |
"'a ty" & (ty 'a) \\
|
|
166 |
"('a, 'b) ty" & (ty 'a 'b) \\
|
|
167 |
"\%x y z.\ t" & ("_abs" x ("_abs" y ("_abs" z t))) \\
|
|
168 |
"\%x ::\ 'a.\ t" & ("_abs" ("_constrain" x 'a) t) \\
|
|
169 |
"[| P; Q; R |] => S" & ("==>" P ("==>" Q ("==>" R S))) \\
|
|
170 |
"['a, 'b, 'c] => 'd" & ("fun" 'a ("fun" 'b ("fun" 'c 'd)))
|
|
171 |
\end{tabular}
|
|
172 |
\end{center}
|
|
173 |
\caption{Built-in parse \AST{} translations}\label{fig:parse_ast_tr}
|
|
174 |
\end{figure}
|
|
175 |
|
|
176 |
The names of constant heads in the \AST{} control the translation process.
|
|
177 |
The list of constants invoking parse \AST{} translations appears in the
|
|
178 |
output of {\tt Syntax.print_syntax} under {\tt parse_ast_translation}.
|
|
179 |
|
|
180 |
|
|
181 |
\section{Transforming \AST{}s to terms}\label{sec:termofast}
|
|
182 |
\index{terms!made from ASTs}
|
|
183 |
\newcommand\termofast[1]{\lbrakk#1\rbrakk}
|
|
184 |
|
|
185 |
The \AST{}, after application of macros (see \S\ref{sec:macros}), is
|
|
186 |
transformed into a term. This term is probably ill-typed since type
|
|
187 |
inference has not occurred yet. The term may contain type constraints
|
|
188 |
consisting of applications with head {\tt "_constrain"}; the second
|
|
189 |
argument is a type encoded as a term. Type inference later introduces
|
|
190 |
correct types or rejects the input.
|
|
191 |
|
|
192 |
Another set of translation functions, namely parse
|
|
193 |
translations\index{translations!parse}, may affect this process. If we
|
|
194 |
ignore parse translations for the time being, then \AST{}s are transformed
|
|
195 |
to terms by mapping \AST{} constants to constants, \AST{} variables to
|
|
196 |
schematic or free variables and \AST{} applications to applications.
|
|
197 |
|
|
198 |
More precisely, the mapping $\termofast{-}$ is defined by
|
|
199 |
\begin{itemize}
|
|
200 |
\item Constants: $\termofast{\Constant x} = \ttfct{Const} (x,
|
|
201 |
\mtt{dummyT})$.
|
|
202 |
|
|
203 |
\item Schematic variables: $\termofast{\Variable \mtt{"?}xi\mtt"} =
|
|
204 |
\ttfct{Var} ((x, i), \mtt{dummyT})$, where $x$ is the base name and $i$
|
|
205 |
the index extracted from~$xi$.
|
|
206 |
|
|
207 |
\item Free variables: $\termofast{\Variable x} = \ttfct{Free} (x,
|
|
208 |
\mtt{dummyT})$.
|
|
209 |
|
|
210 |
\item Function applications with $n$ arguments:
|
|
211 |
\[ \termofast{\Appl{f, x@1, \ldots, x@n}} =
|
|
212 |
\termofast{f} \ttapp
|
|
213 |
\termofast{x@1} \ttapp \ldots \ttapp \termofast{x@n}
|
|
214 |
\]
|
|
215 |
\end{itemize}
|
|
216 |
Here \ttindex{Const}, \ttindex{Var}, \ttindex{Free} and
|
|
217 |
\verb|$|\index{$@{\tt\$}} are constructors of the datatype \mltydx{term},
|
|
218 |
while \ttindex{dummyT} stands for some dummy type that is ignored during
|
|
219 |
type inference.
|
|
220 |
|
|
221 |
So far the outcome is still a first-order term. Abstractions and bound
|
|
222 |
variables (constructors \ttindex{Abs} and \ttindex{Bound}) are introduced
|
|
223 |
by parse translations. Such translations are attached to {\tt "_abs"},
|
|
224 |
{\tt "!!"} and user-defined binders.
|
|
225 |
|
|
226 |
|
|
227 |
\section{Printing of terms}
|
|
228 |
\newcommand\astofterm[1]{\lbrakk#1\rbrakk}\index{ASTs!made from terms}
|
|
229 |
|
|
230 |
The output phase is essentially the inverse of the input phase. Terms are
|
|
231 |
translated via abstract syntax trees into strings. Finally the strings are
|
|
232 |
pretty printed.
|
|
233 |
|
|
234 |
Print translations (\S\ref{sec:tr_funs}) may affect the transformation of
|
|
235 |
terms into \AST{}s. Ignoring those, the transformation maps
|
|
236 |
term constants, variables and applications to the corresponding constructs
|
|
237 |
on \AST{}s. Abstractions are mapped to applications of the special
|
|
238 |
constant {\tt _abs}.
|
|
239 |
|
|
240 |
More precisely, the mapping $\astofterm{-}$ is defined as follows:
|
|
241 |
\begin{itemize}
|
|
242 |
\item $\astofterm{\ttfct{Const} (x, \tau)} = \Constant x$.
|
|
243 |
|
|
244 |
\item $\astofterm{\ttfct{Free} (x, \tau)} = constrain (\Variable x,
|
|
245 |
\tau)$.
|
|
246 |
|
|
247 |
\item $\astofterm{\ttfct{Var} ((x, i), \tau)} = constrain (\Variable
|
|
248 |
\mtt{"?}xi\mtt", \tau)$, where $\mtt?xi$ is the string representation of
|
|
249 |
the {\tt indexname} $(x, i)$.
|
|
250 |
|
|
251 |
\item For the abstraction $\lambda x::\tau.t$, let $x'$ be a variant
|
|
252 |
of~$x$ renamed to differ from all names occurring in~$t$, and let $t'$
|
|
253 |
be obtained from~$t$ by replacing all bound occurrences of~$x$ by
|
|
254 |
the free variable $x'$. This replaces corresponding occurrences of the
|
|
255 |
constructor \ttindex{Bound} by the term $\ttfct{Free} (x',
|
|
256 |
\mtt{dummyT})$:
|
|
257 |
\[ \astofterm{\ttfct{Abs} (x, \tau, t)} =
|
|
258 |
\Appl{\Constant \mtt{"_abs"},
|
|
259 |
constrain(\Variable x', \tau), \astofterm{t'}}.
|
|
260 |
\]
|
|
261 |
|
|
262 |
\item $\astofterm{\ttfct{Bound} i} = \Variable \mtt{"B.}i\mtt"$.
|
|
263 |
The occurrence of constructor \ttindex{Bound} should never happen
|
|
264 |
when printing well-typed terms; it indicates a de Bruijn index with no
|
|
265 |
matching abstraction.
|
|
266 |
|
|
267 |
\item Where $f$ is not an application,
|
|
268 |
\[ \astofterm{f \ttapp x@1 \ttapp \ldots \ttapp x@n} =
|
|
269 |
\Appl{\astofterm{f}, \astofterm{x@1}, \ldots,\astofterm{x@n}}
|
|
270 |
\]
|
|
271 |
\end{itemize}
|
|
272 |
%
|
|
273 |
Type constraints are inserted to allow the printing of types. This is
|
|
274 |
governed by the boolean variable \ttindex{show_types}:
|
|
275 |
\begin{itemize}
|
|
276 |
\item $constrain(x, \tau) = x$ \ if $\tau = \mtt{dummyT}$ \index{*dummyT} or
|
|
277 |
\ttindex{show_types} not set to {\tt true}.
|
|
278 |
|
|
279 |
\item $constrain(x, \tau) = \Appl{\Constant \mtt{"_constrain"}, x,
|
|
280 |
\astofterm{\tau}}$ \ otherwise.
|
|
281 |
|
|
282 |
Here, $\astofterm{\tau}$ is the \AST{} encoding of $\tau$: type
|
|
283 |
constructors go to {\tt Constant}s; type identifiers go to {\tt
|
|
284 |
Variable}s; type applications go to {\tt Appl}s with the type
|
|
285 |
constructor as the first element. If \ttindex{show_sorts} is set to
|
|
286 |
{\tt true}, some type variables are decorated with an \AST{} encoding
|
|
287 |
of their sort.
|
|
288 |
\end{itemize}
|
|
289 |
%
|
|
290 |
The \AST{}, after application of macros (see \S\ref{sec:macros}), is
|
|
291 |
transformed into the final output string. The built-in {\bf print AST
|
|
292 |
translations}\indexbold{translations!print AST} effectively reverse the
|
|
293 |
parse \AST{} translations of Fig.\ts\ref{fig:parse_ast_tr}.
|
|
294 |
|
|
295 |
For the actual printing process, the names attached to productions
|
|
296 |
of the form $\ldots A^{(p@1)}@1 \ldots A^{(p@n)}@n \ldots \mtt{=>} c$ play
|
|
297 |
a vital role. Each \AST{} with constant head $c$, namely $\mtt"c\mtt"$ or
|
|
298 |
$(\mtt"c\mtt"~ x@1 \ldots x@n)$, is printed according to the production
|
|
299 |
for~$c$. Each argument~$x@i$ is converted to a string, and put in
|
|
300 |
parentheses if its priority~$(p@i)$ requires this. The resulting strings
|
|
301 |
and their syntactic sugar (denoted by \dots{} above) are joined to make a
|
|
302 |
single string.
|
|
303 |
|
|
304 |
If an application $(\mtt"c\mtt"~ x@1 \ldots x@m)$ has more arguments than the
|
|
305 |
corresponding production, it is first split into $((\mtt"c\mtt"~ x@1 \ldots
|
|
306 |
x@n) ~ x@{n+1} \ldots x@m)$. Applications with too few arguments or with
|
|
307 |
non-constant head or without a corresponding production are printed as
|
|
308 |
$f(x@1, \ldots, x@l)$ or $(\alpha@1, \ldots, \alpha@l) ty$. An occurrence of
|
|
309 |
$\Variable x$ is simply printed as~$x$.
|
|
310 |
|
|
311 |
Blanks are {\em not\/} inserted automatically. If blanks are required to
|
|
312 |
separate tokens, specify them in the mixfix declaration, possibly preceded
|
|
313 |
by a slash~({\tt/}) to allow a line break.
|
|
314 |
\index{ASTs|)}
|
|
315 |
|
|
316 |
|
|
317 |
|
|
318 |
\section{Macros: Syntactic rewriting} \label{sec:macros}
|
|
319 |
\index{macros|(}\index{rewriting!syntactic|(}
|
|
320 |
|
|
321 |
Mixfix declarations alone can handle situations where there is a direct
|
|
322 |
connection between the concrete syntax and the underlying term. Sometimes
|
|
323 |
we require a more elaborate concrete syntax, such as quantifiers and list
|
|
324 |
notation. Isabelle's {\bf macros} and {\bf translation functions} can
|
|
325 |
perform translations such as
|
|
326 |
\begin{center}\tt
|
|
327 |
\begin{tabular}{r@{$\quad\protect\rightleftharpoons\quad$}l}
|
|
328 |
ALL x:A.P & Ball(A, \%x.P) \\ \relax
|
|
329 |
[x, y, z] & Cons(x, Cons(y, Cons(z, Nil)))
|
|
330 |
\end{tabular}
|
|
331 |
\end{center}
|
|
332 |
Translation functions (see \S\ref{sec:tr_funs}) must be coded in ML; they
|
|
333 |
are the most powerful translation mechanism but are difficult to read or
|
|
334 |
write. Macros are specified by first-order rewriting systems that operate
|
|
335 |
on abstract syntax trees. They are usually easy to read and write, and can
|
|
336 |
express all but the most obscure translations.
|
|
337 |
|
|
338 |
Figure~\ref{fig:set_trans} defines a fragment of first-order logic and set
|
|
339 |
theory.\footnote{This and the following theories are complete working
|
|
340 |
examples, though they specify only syntax, no axioms. The file {\tt
|
|
341 |
ZF/zf.thy} presents the full set theory definition, including many
|
|
342 |
macro rules.} Theory {\tt SET} defines constants for set comprehension
|
|
343 |
({\tt Collect}), replacement ({\tt Replace}) and bounded universal
|
|
344 |
quantification ({\tt Ball}). Each of these binds some variables. Without
|
|
345 |
additional syntax we should have to express $\forall x \in A. P$ as {\tt
|
|
346 |
Ball(A,\%x.P)}, and similarly for the others.
|
|
347 |
|
|
348 |
\begin{figure}
|
|
349 |
\begin{ttbox}
|
|
350 |
SET = Pure +
|
|
351 |
types
|
|
352 |
i, o
|
|
353 |
arities
|
|
354 |
i, o :: logic
|
|
355 |
consts
|
|
356 |
Trueprop :: "o => prop" ("_" 5)
|
|
357 |
Collect :: "[i, i => o] => i"
|
|
358 |
"{\at}Collect" :: "[idt, i, o] => i" ("(1{\ttlbrace}_:_./ _{\ttrbrace})")
|
|
359 |
Replace :: "[i, [i, i] => o] => i"
|
|
360 |
"{\at}Replace" :: "[idt, idt, i, o] => i" ("(1{\ttlbrace}_./ _:_, _{\ttrbrace})")
|
|
361 |
Ball :: "[i, i => o] => o"
|
|
362 |
"{\at}Ball" :: "[idt, i, o] => o" ("(3ALL _:_./ _)" 10)
|
|
363 |
translations
|
|
364 |
"{\ttlbrace}x:A. P{\ttrbrace}" == "Collect(A, \%x. P)"
|
|
365 |
"{\ttlbrace}y. x:A, Q{\ttrbrace}" == "Replace(A, \%x y. Q)"
|
|
366 |
"ALL x:A. P" == "Ball(A, \%x. P)"
|
|
367 |
end
|
|
368 |
\end{ttbox}
|
|
369 |
\caption{Macro example: set theory}\label{fig:set_trans}
|
|
370 |
\end{figure}
|
|
371 |
|
|
372 |
The theory specifies a variable-binding syntax through additional
|
|
373 |
productions that have mixfix declarations. Each non-copy production must
|
|
374 |
specify some constant, which is used for building \AST{}s.
|
|
375 |
\index{constants!syntactic} The additional constants are decorated with
|
|
376 |
{\tt\at} to stress their purely syntactic purpose; they should never occur
|
|
377 |
within the final well-typed terms. Furthermore, they cannot be written in
|
|
378 |
formulae because they are not legal identifiers.
|
|
379 |
|
|
380 |
The translations cause the replacement of external forms by internal forms
|
|
381 |
after parsing, and vice versa before printing of terms. As a specification
|
|
382 |
of the set theory notation, they should be largely self-explanatory. The
|
|
383 |
syntactic constants, {\tt\at Collect}, {\tt\at Replace} and {\tt\at Ball},
|
|
384 |
appear implicitly in the macro rules via their mixfix forms.
|
|
385 |
|
|
386 |
Macros can define variable-binding syntax because they operate on \AST{}s,
|
|
387 |
which have no inbuilt notion of bound variable. The macro variables {\tt
|
|
388 |
x} and~{\tt y} have type~{\tt idt} and therefore range over identifiers,
|
|
389 |
in this case bound variables. The macro variables {\tt P} and~{\tt Q}
|
|
390 |
range over formulae containing bound variable occurrences.
|
|
391 |
|
|
392 |
Other applications of the macro system can be less straightforward, and
|
|
393 |
there are peculiarities. The rest of this section will describe in detail
|
|
394 |
how Isabelle macros are preprocessed and applied.
|
|
395 |
|
|
396 |
|
|
397 |
\subsection{Specifying macros}
|
|
398 |
Macros are basically rewrite rules on \AST{}s. But unlike other macro
|
|
399 |
systems found in programming languages, Isabelle's macros work in both
|
|
400 |
directions. Therefore a syntax contains two lists of rewrites: one for
|
|
401 |
parsing and one for printing.
|
|
402 |
|
|
403 |
\index{*translations section}
|
|
404 |
The {\tt translations} section specifies macros. The syntax for a macro is
|
|
405 |
\[ (root)\; string \quad
|
|
406 |
\left\{\begin{array}[c]{c} \mtt{=>} \\ \mtt{<=} \\ \mtt{==} \end{array}
|
|
407 |
\right\} \quad
|
|
408 |
(root)\; string
|
|
409 |
\]
|
|
410 |
%
|
|
411 |
This specifies a parse rule ({\tt =>}), a print rule ({\tt <=}), or both
|
|
412 |
({\tt ==}). The two strings specify the left and right-hand sides of the
|
|
413 |
macro rule. The $(root)$ specification is optional; it specifies the
|
|
414 |
nonterminal for parsing the $string$ and if omitted defaults to {\tt
|
|
415 |
logic}. \AST{} rewrite rules $(l, r)$ must obey certain conditions:
|
|
416 |
\begin{itemize}
|
|
417 |
\item Rules must be left linear: $l$ must not contain repeated variables.
|
|
418 |
|
|
419 |
\item Rules must have constant heads, namely $l = \mtt"c\mtt"$ or $l =
|
|
420 |
(\mtt"c\mtt" ~ x@1 \ldots x@n)$.
|
|
421 |
|
|
422 |
\item Every variable in~$r$ must also occur in~$l$.
|
|
423 |
\end{itemize}
|
|
424 |
|
|
425 |
Macro rules may refer to any syntax from the parent theories. They may
|
|
426 |
also refer to anything defined before the the {\tt .thy} file's {\tt
|
|
427 |
translations} section --- including any mixfix declarations.
|
|
428 |
|
|
429 |
Upon declaration, both sides of the macro rule undergo parsing and parse
|
|
430 |
\AST{} translations (see \S\ref{sec:asts}), but do not themselves undergo
|
|
431 |
macro expansion. The lexer runs in a different mode that additionally
|
|
432 |
accepts identifiers of the form $\_~letter~quasiletter^*$ (like {\tt _idt},
|
|
433 |
{\tt _K}). Thus, a constant whose name starts with an underscore can
|
|
434 |
appear in macro rules but not in ordinary terms.
|
|
435 |
|
|
436 |
Some atoms of the macro rule's \AST{} are designated as constants for
|
|
437 |
matching. These are all names that have been declared as classes, types or
|
|
438 |
constants.
|
|
439 |
|
|
440 |
The result of this preprocessing is two lists of macro rules, each stored
|
|
441 |
as a pair of \AST{}s. They can be viewed using {\tt Syntax.print_syntax}
|
|
442 |
(sections \ttindex{parse_rules} and \ttindex{print_rules}). For
|
|
443 |
theory~{\tt SET} of Fig.~\ref{fig:set_trans} these are
|
|
444 |
\begin{ttbox}
|
|
445 |
parse_rules:
|
|
446 |
("{\at}Collect" x A P) -> ("Collect" A ("_abs" x P))
|
|
447 |
("{\at}Replace" y x A Q) -> ("Replace" A ("_abs" x ("_abs" y Q)))
|
|
448 |
("{\at}Ball" x A P) -> ("Ball" A ("_abs" x P))
|
|
449 |
print_rules:
|
|
450 |
("Collect" A ("_abs" x P)) -> ("{\at}Collect" x A P)
|
|
451 |
("Replace" A ("_abs" x ("_abs" y Q))) -> ("{\at}Replace" y x A Q)
|
|
452 |
("Ball" A ("_abs" x P)) -> ("{\at}Ball" x A P)
|
|
453 |
\end{ttbox}
|
|
454 |
|
|
455 |
\begin{warn}
|
|
456 |
Avoid choosing variable names that have previously been used as
|
|
457 |
constants, types or type classes; the {\tt consts} section in the output
|
|
458 |
of {\tt Syntax.print_syntax} lists all such names. If a macro rule works
|
|
459 |
incorrectly, inspect its internal form as shown above, recalling that
|
|
460 |
constants appear as quoted strings and variables without quotes.
|
|
461 |
\end{warn}
|
|
462 |
|
|
463 |
\begin{warn}
|
|
464 |
If \ttindex{eta_contract} is set to {\tt true}, terms will be
|
|
465 |
$\eta$-contracted {\em before\/} the \AST{} rewriter sees them. Thus some
|
|
466 |
abstraction nodes needed for print rules to match may vanish. For example,
|
|
467 |
\verb|Ball(A, %x. P(x))| contracts {\tt Ball(A, P)}; the print rule does
|
|
468 |
not apply and the output will be {\tt Ball(A, P)}. This problem would not
|
|
469 |
occur if \ML{} translation functions were used instead of macros (as is
|
|
470 |
done for binder declarations).
|
|
471 |
\end{warn}
|
|
472 |
|
|
473 |
|
|
474 |
\begin{warn}
|
|
475 |
Another trap concerns type constraints. If \ttindex{show_types} is set to
|
|
476 |
{\tt true}, bound variables will be decorated by their meta types at the
|
|
477 |
binding place (but not at occurrences in the body). Matching with
|
|
478 |
\verb|Collect(A, %x. P)| binds {\tt x} to something like {\tt ("_constrain" y
|
|
479 |
"i")} rather than only {\tt y}. \AST{} rewriting will cause the constraint to
|
|
480 |
appear in the external form, say \verb|{y::i:A::i. P::o}|.
|
|
481 |
|
|
482 |
To allow such constraints to be re-read, your syntax should specify bound
|
|
483 |
variables using the nonterminal~\ndx{idt}. This is the case in our
|
|
484 |
example. Choosing {\tt id} instead of {\tt idt} is a common error,
|
|
485 |
especially since it appears in former versions of most of Isabelle's
|
|
486 |
object-logics.
|
|
487 |
\end{warn}
|
|
488 |
|
|
489 |
|
|
490 |
|
|
491 |
\subsection{Applying rules}
|
|
492 |
As a term is being parsed or printed, an \AST{} is generated as an
|
|
493 |
intermediate form (recall Fig.\ts\ref{fig:parse_print}). The \AST{} is
|
|
494 |
normalized by applying macro rules in the manner of a traditional term
|
|
495 |
rewriting system. We first examine how a single rule is applied.
|
|
496 |
|
|
497 |
Let $t$ be the abstract syntax tree to be normalized and $(l, r)$ some
|
|
498 |
translation rule. A subtree~$u$ of $t$ is a {\bf redex} if it is an
|
|
499 |
instance of~$l$; in this case $l$ is said to {\bf match}~$u$. A redex
|
|
500 |
matched by $l$ may be replaced by the corresponding instance of~$r$, thus
|
|
501 |
{\bf rewriting} the \AST~$t$. Matching requires some notion of {\bf
|
|
502 |
place-holders} that may occur in rule patterns but not in ordinary
|
|
503 |
\AST{}s; {\tt Variable} atoms serve this purpose.
|
|
504 |
|
|
505 |
The matching of the object~$u$ by the pattern~$l$ is performed as follows:
|
|
506 |
\begin{itemize}
|
|
507 |
\item Every constant matches itself.
|
|
508 |
|
|
509 |
\item $\Variable x$ in the object matches $\Constant x$ in the pattern.
|
|
510 |
This point is discussed further below.
|
|
511 |
|
|
512 |
\item Every \AST{} in the object matches $\Variable x$ in the pattern,
|
|
513 |
binding~$x$ to~$u$.
|
|
514 |
|
|
515 |
\item One application matches another if they have the same number of
|
|
516 |
subtrees and corresponding subtrees match.
|
|
517 |
|
|
518 |
\item In every other case, matching fails. In particular, {\tt
|
|
519 |
Constant}~$x$ can only match itself.
|
|
520 |
\end{itemize}
|
|
521 |
A successful match yields a substitution that is applied to~$r$, generating
|
|
522 |
the instance that replaces~$u$.
|
|
523 |
|
|
524 |
The second case above may look odd. This is where {\tt Variable}s of
|
|
525 |
non-rule \AST{}s behave like {\tt Constant}s. Recall that \AST{}s are not
|
|
526 |
far removed from parse trees; at this level it is not yet known which
|
|
527 |
identifiers will become constants, bounds, frees, types or classes. As
|
|
528 |
\S\ref{sec:asts} describes, former parse tree heads appear in \AST{}s as
|
|
529 |
{\tt Constant}s, while the name tokens \ndx{id}, \ndx{var}, \ndx{tid} and
|
|
530 |
\ndx{tvar} become {\tt Variable}s. On the other hand, when \AST{}s
|
|
531 |
generated from terms for printing, all constants and type constructors
|
|
532 |
become {\tt Constant}s; see \S\ref{sec:asts}. Thus \AST{}s may contain a
|
|
533 |
messy mixture of {\tt Variable}s and {\tt Constant}s. This is
|
|
534 |
insignificant at macro level because matching treats them alike.
|
|
535 |
|
|
536 |
Because of this behaviour, different kinds of atoms with the same name are
|
|
537 |
indistinguishable, which may make some rules prone to misbehaviour. Example:
|
|
538 |
\begin{ttbox}
|
|
539 |
types
|
|
540 |
Nil
|
|
541 |
consts
|
|
542 |
Nil :: "'a list"
|
|
543 |
"[]" :: "'a list" ("[]")
|
|
544 |
translations
|
|
545 |
"[]" == "Nil"
|
|
546 |
\end{ttbox}
|
|
547 |
The term {\tt Nil} will be printed as {\tt []}, just as expected.
|
|
548 |
The term \verb|%Nil.t| will be printed as \verb|%[].t|, which might not be
|
|
549 |
expected!
|
|
550 |
|
|
551 |
Normalizing an \AST{} involves repeatedly applying macro rules until none
|
|
552 |
is applicable. Macro rules are chosen in the order that they appear in the
|
|
553 |
{\tt translations} section. You can watch the normalization of \AST{}s
|
|
554 |
during parsing and printing by setting \ttindex{Syntax.trace_norm_ast} to
|
|
555 |
{\tt true}.\index{tracing!of macros} Alternatively, use
|
|
556 |
\ttindex{Syntax.test_read}. The information displayed when tracing
|
|
557 |
includes the \AST{} before normalization ({\tt pre}), redexes with results
|
|
558 |
({\tt rewrote}), the normal form finally reached ({\tt post}) and some
|
|
559 |
statistics ({\tt normalize}). If tracing is off,
|
|
560 |
\ttindex{Syntax.stat_norm_ast} can be set to {\tt true} in order to enable
|
|
561 |
printing of the normal form and statistics only.
|
|
562 |
|
|
563 |
|
|
564 |
\subsection{Example: the syntax of finite sets}
|
|
565 |
\index{examples!of macros}
|
|
566 |
|
|
567 |
This example demonstrates the use of recursive macros to implement a
|
|
568 |
convenient notation for finite sets.
|
|
569 |
\index{*empty constant}\index{*insert constant}\index{{}@\verb'{}' symbol}
|
|
570 |
\index{"@Enum@{\tt\at Enum} constant}
|
|
571 |
\index{"@Finset@{\tt\at Finset} constant}
|
|
572 |
\begin{ttbox}
|
|
573 |
FINSET = SET +
|
|
574 |
types
|
|
575 |
is
|
|
576 |
consts
|
|
577 |
"" :: "i => is" ("_")
|
|
578 |
"{\at}Enum" :: "[i, is] => is" ("_,/ _")
|
|
579 |
empty :: "i" ("{\ttlbrace}{\ttrbrace}")
|
|
580 |
insert :: "[i, i] => i"
|
|
581 |
"{\at}Finset" :: "is => i" ("{\ttlbrace}(_){\ttrbrace}")
|
|
582 |
translations
|
|
583 |
"{\ttlbrace}x, xs{\ttrbrace}" == "insert(x, {\ttlbrace}xs{\ttrbrace})"
|
|
584 |
"{\ttlbrace}x{\ttrbrace}" == "insert(x, {\ttlbrace}{\ttrbrace})"
|
|
585 |
end
|
|
586 |
\end{ttbox}
|
|
587 |
Finite sets are internally built up by {\tt empty} and {\tt insert}. The
|
|
588 |
declarations above specify \verb|{x, y, z}| as the external representation
|
|
589 |
of
|
|
590 |
\begin{ttbox}
|
|
591 |
insert(x, insert(y, insert(z, empty)))
|
|
592 |
\end{ttbox}
|
|
593 |
The nonterminal symbol~\ndx{is} stands for one or more objects of type~{\tt
|
|
594 |
i} separated by commas. The mixfix declaration \hbox{\verb|"_,/ _"|}
|
|
595 |
allows a line break after the comma for \rmindex{pretty printing}; if no
|
|
596 |
line break is required then a space is printed instead.
|
|
597 |
|
|
598 |
The nonterminal is declared as the type~{\tt is}, but with no {\tt arities}
|
|
599 |
declaration. Hence {\tt is} is not a logical type and no default
|
|
600 |
productions are added. If we had needed enumerations of the nonterminal
|
|
601 |
{\tt logic}, which would include all the logical types, we could have used
|
|
602 |
the predefined nonterminal symbol \ndx{args} and skipped this part
|
|
603 |
altogether. The nonterminal~{\tt is} can later be reused for other
|
|
604 |
enumerations of type~{\tt i} like lists or tuples.
|
|
605 |
|
|
606 |
\index{"@Finset@{\tt\at Finset} constant}
|
|
607 |
Next follows {\tt empty}, which is already equipped with its syntax
|
|
608 |
\verb|{}|, and {\tt insert} without concrete syntax. The syntactic
|
|
609 |
constant {\tt\at Finset} provides concrete syntax for enumerations of~{\tt
|
|
610 |
i} enclosed in curly braces. Remember that a pair of parentheses, as in
|
|
611 |
\verb|"{(_)}"|, specifies a block of indentation for pretty printing.
|
|
612 |
|
|
613 |
The translations may look strange at first. Macro rules are best
|
|
614 |
understood in their internal forms:
|
|
615 |
\begin{ttbox}
|
|
616 |
parse_rules:
|
|
617 |
("{\at}Finset" ("{\at}Enum" x xs)) -> ("insert" x ("{\at}Finset" xs))
|
|
618 |
("{\at}Finset" x) -> ("insert" x "empty")
|
|
619 |
print_rules:
|
|
620 |
("insert" x ("{\at}Finset" xs)) -> ("{\at}Finset" ("{\at}Enum" x xs))
|
|
621 |
("insert" x "empty") -> ("{\at}Finset" x)
|
|
622 |
\end{ttbox}
|
|
623 |
This shows that \verb|{x, xs}| indeed matches any set enumeration of at least
|
|
624 |
two elements, binding the first to {\tt x} and the rest to {\tt xs}.
|
|
625 |
Likewise, \verb|{xs}| and \verb|{x}| represent any set enumeration.
|
|
626 |
The parse rules only work in the order given.
|
|
627 |
|
|
628 |
\begin{warn}
|
|
629 |
The \AST{} rewriter cannot discern constants from variables and looks
|
|
630 |
only for names of atoms. Thus the names of {\tt Constant}s occurring in
|
|
631 |
the (internal) left-hand side of translation rules should be regarded as
|
|
632 |
\rmindex{reserved words}. Choose non-identifiers like {\tt\at Finset} or
|
|
633 |
sufficiently long and strange names. If a bound variable's name gets
|
|
634 |
rewritten, the result will be incorrect; for example, the term
|
|
635 |
\begin{ttbox}
|
|
636 |
\%empty insert. insert(x, empty)
|
|
637 |
\end{ttbox}
|
|
638 |
gets printed as \verb|%empty insert. {x}|.
|
|
639 |
\end{warn}
|
|
640 |
|
|
641 |
|
|
642 |
\subsection{Example: a parse macro for dependent types}\label{prod_trans}
|
|
643 |
\index{examples!of macros}
|
|
644 |
|
|
645 |
As stated earlier, a macro rule may not introduce new {\tt Variable}s on
|
|
646 |
the right-hand side. Something like \verb|"K(B)" => "%x. B"| is illegal;
|
|
647 |
if allowed, it could cause variable capture. In such cases you usually
|
|
648 |
must fall back on translation functions. But a trick can make things
|
|
649 |
readable in some cases: {\em calling translation functions by parse
|
|
650 |
macros}:
|
|
651 |
\begin{ttbox}
|
|
652 |
PROD = FINSET +
|
|
653 |
consts
|
|
654 |
Pi :: "[i, i => i] => i"
|
|
655 |
"{\at}PROD" :: "[idt, i, i] => i" ("(3PROD _:_./ _)" 10)
|
|
656 |
"{\at}->" :: "[i, i] => i" ("(_ ->/ _)" [51, 50] 50)
|
|
657 |
\ttbreak
|
|
658 |
translations
|
|
659 |
"PROD x:A. B" => "Pi(A, \%x. B)"
|
|
660 |
"A -> B" => "Pi(A, _K(B))"
|
|
661 |
end
|
|
662 |
ML
|
|
663 |
val print_translation = [("Pi", dependent_tr' ("{\at}PROD", "{\at}->"))];
|
|
664 |
\end{ttbox}
|
|
665 |
|
|
666 |
Here {\tt Pi} is an internal constant for constructing general products.
|
|
667 |
Two external forms exist: the general case {\tt PROD x:A.B} and the
|
|
668 |
function space {\tt A -> B}, which abbreviates \verb|Pi(A, %x.B)| when {\tt B}
|
|
669 |
does not depend on~{\tt x}.
|
|
670 |
|
|
671 |
The second parse macro introduces {\tt _K(B)}, which later becomes \verb|%x.B|
|
|
672 |
due to a parse translation associated with \cdx{_K}. The order of the
|
|
673 |
parse rules is critical. Unfortunately there is no such trick for
|
|
674 |
printing, so we have to add a {\tt ML} section for the print translation
|
|
675 |
\ttindex{dependent_tr'}.
|
|
676 |
|
|
677 |
Recall that identifiers with a leading {\tt _} are allowed in translation
|
|
678 |
rules, but not in ordinary terms. Thus we can create \AST{}s containing
|
|
679 |
names that are not directly expressible.
|
|
680 |
|
|
681 |
The parse translation for {\tt _K} is already installed in Pure, and {\tt
|
|
682 |
dependent_tr'} is exported by the syntax module for public use. See
|
|
683 |
\S\ref{sec:tr_funs} below for more of the arcane lore of translation functions.
|
|
684 |
\index{macros|)}\index{rewriting!syntactic|)}
|
|
685 |
|
|
686 |
|
|
687 |
\section{Translation functions} \label{sec:tr_funs}
|
|
688 |
\index{translations|(}
|
|
689 |
%
|
|
690 |
This section describes the translation function mechanism. By writing
|
|
691 |
\ML{} functions, you can do almost everything with terms or \AST{}s during
|
|
692 |
parsing and printing. The logic \LK\ is a good example of sophisticated
|
|
693 |
transformations between internal and external representations of
|
|
694 |
associative sequences; here, macros would be useless.
|
|
695 |
|
|
696 |
A full understanding of translations requires some familiarity
|
|
697 |
with Isabelle's internals, especially the datatypes {\tt term}, {\tt typ},
|
|
698 |
{\tt Syntax.ast} and the encodings of types and terms as such at the various
|
|
699 |
stages of the parsing or printing process. Most users should never need to
|
|
700 |
use translation functions.
|
|
701 |
|
|
702 |
\subsection{Declaring translation functions}
|
|
703 |
There are four kinds of translation functions. Each such function is
|
|
704 |
associated with a name, which triggers calls to it. Such names can be
|
|
705 |
constants (logical or syntactic) or type constructors.
|
|
706 |
|
|
707 |
{\tt Syntax.print_syntax} displays the sets of names associated with the
|
|
708 |
translation functions of a {\tt Syntax.syntax} under
|
|
709 |
\ttindex{parse_ast_translation}, \ttindex{parse_translation},
|
|
710 |
\ttindex{print_translation} and \ttindex{print_ast_translation}. You can
|
|
711 |
add new ones via the {\tt ML} section\index{*ML section} of
|
|
712 |
a {\tt .thy} file. There may never be more than one function of the same
|
|
713 |
kind per name. Conceptually, the {\tt ML} section should appear between
|
|
714 |
{\tt consts} and {\tt translations}; newly installed translation functions
|
|
715 |
are already effective when macros and logical rules are parsed.
|
|
716 |
|
|
717 |
The {\tt ML} section is copied verbatim into the \ML\ file generated from a
|
|
718 |
{\tt .thy} file. Definitions made here are accessible as components of an
|
|
719 |
\ML\ structure; to make some definitions private, use an \ML{} {\tt local}
|
|
720 |
declaration. The {\tt ML} section may install translation functions by
|
|
721 |
declaring any of the following identifiers:
|
|
722 |
\begin{ttbox}
|
|
723 |
val parse_ast_translation : (string * (ast list -> ast)) list
|
|
724 |
val print_ast_translation : (string * (ast list -> ast)) list
|
|
725 |
val parse_translation : (string * (term list -> term)) list
|
|
726 |
val print_translation : (string * (term list -> term)) list
|
|
727 |
\end{ttbox}
|
|
728 |
|
|
729 |
\subsection{The translation strategy}
|
|
730 |
All four kinds of translation functions are treated similarly. They are
|
|
731 |
called during the transformations between parse trees, \AST{}s and terms
|
|
732 |
(recall Fig.\ts\ref{fig:parse_print}). Whenever a combination of the form
|
|
733 |
$(\mtt"c\mtt"~x@1 \ldots x@n)$ is encountered, and a translation function
|
|
734 |
$f$ of appropriate kind exists for $c$, the result is computed by the \ML{}
|
|
735 |
function call $f \mtt[ x@1, \ldots, x@n \mtt]$.
|
|
736 |
|
|
737 |
For \AST{} translations, the arguments $x@1, \ldots, x@n$ are \AST{}s. A
|
|
738 |
combination has the form $\Constant c$ or $\Appl{\Constant c, x@1, \ldots,
|
|
739 |
x@n}$. For term translations, the arguments are terms and a combination
|
|
740 |
has the form $\ttfct{Const} (c, \tau)$ or $\ttfct{Const} (c, \tau) \ttapp
|
|
741 |
x@1 \ttapp \ldots \ttapp x@n$. Terms allow more sophisticated
|
|
742 |
transformations than \AST{}s do, typically involving abstractions and bound
|
|
743 |
variables.
|
|
744 |
|
|
745 |
Regardless of whether they act on terms or \AST{}s,
|
|
746 |
parse translations differ from print translations fundamentally:
|
|
747 |
\begin{description}
|
|
748 |
\item[Parse translations] are applied bottom-up. The arguments are already
|
|
749 |
in translated form. The translations must not fail; exceptions trigger
|
|
750 |
an error message.
|
|
751 |
|
|
752 |
\item[Print translations] are applied top-down. They are supplied with
|
|
753 |
arguments that are partly still in internal form. The result again
|
|
754 |
undergoes translation; therefore a print translation should not introduce
|
|
755 |
as head the very constant that invoked it. The function may raise
|
|
756 |
exception \xdx{Match} to indicate failure; in this event it has no
|
|
757 |
effect.
|
|
758 |
\end{description}
|
|
759 |
|
|
760 |
Only constant atoms --- constructor \ttindex{Constant} for \AST{}s and
|
|
761 |
\ttindex{Const} for terms --- can invoke translation functions. This
|
|
762 |
causes another difference between parsing and printing.
|
|
763 |
|
|
764 |
Parsing starts with a string and the constants are not yet identified.
|
|
765 |
Only parse tree heads create {\tt Constant}s in the resulting \AST, as
|
|
766 |
described in \S\ref{sec:astofpt}. Macros and parse \AST{} translations may
|
|
767 |
introduce further {\tt Constant}s. When the final \AST{} is converted to a
|
|
768 |
term, all {\tt Constant}s become {\tt Const}s, as described in
|
|
769 |
\S\ref{sec:termofast}.
|
|
770 |
|
|
771 |
Printing starts with a well-typed term and all the constants are known. So
|
|
772 |
all logical constants and type constructors may invoke print translations.
|
|
773 |
These, and macros, may introduce further constants.
|
|
774 |
|
|
775 |
|
|
776 |
\subsection{Example: a print translation for dependent types}
|
|
777 |
\index{examples!of translations}\indexbold{*dependent_tr'}
|
|
778 |
|
|
779 |
Let us continue the dependent type example (page~\pageref{prod_trans}) by
|
|
780 |
examining the parse translation for~\cdx{_K} and the print translation
|
|
781 |
{\tt dependent_tr'}, which are both built-in. By convention, parse
|
|
782 |
translations have names ending with {\tt _tr} and print translations have
|
|
783 |
names ending with {\tt _tr'}. Search for such names in the Isabelle
|
|
784 |
sources to locate more examples.
|
|
785 |
|
|
786 |
Here is the parse translation for {\tt _K}:
|
|
787 |
\begin{ttbox}
|
|
788 |
fun k_tr [t] = Abs ("x", dummyT, incr_boundvars 1 t)
|
|
789 |
| k_tr ts = raise TERM("k_tr",ts);
|
|
790 |
\end{ttbox}
|
|
791 |
If {\tt k_tr} is called with exactly one argument~$t$, it creates a new
|
|
792 |
{\tt Abs} node with a body derived from $t$. Since terms given to parse
|
|
793 |
translations are not yet typed, the type of the bound variable in the new
|
|
794 |
{\tt Abs} is simply {\tt dummyT}. The function increments all {\tt Bound}
|
|
795 |
nodes referring to outer abstractions by calling \ttindex{incr_boundvars},
|
|
796 |
a basic term manipulation function defined in {\tt Pure/term.ML}.
|
|
797 |
|
|
798 |
Here is the print translation for dependent types:
|
|
799 |
\begin{ttbox}
|
|
800 |
fun dependent_tr' (q,r) (A :: Abs (x, T, B) :: ts) =
|
|
801 |
if 0 mem (loose_bnos B) then
|
|
802 |
let val (x', B') = variant_abs (x, dummyT, B);
|
|
803 |
in list_comb (Const (q, dummyT) $ Free (x', T) $ A $ B', ts)
|
|
804 |
end
|
|
805 |
else list_comb (Const (r, dummyT) $ A $ B, ts)
|
|
806 |
| dependent_tr' _ _ = raise Match;
|
|
807 |
\end{ttbox}
|
|
808 |
The argument {\tt (q,r)} is supplied to {\tt dependent_tr'} by a curried
|
|
809 |
function application during its installation. We could set up print
|
|
810 |
translations for both {\tt Pi} and {\tt Sigma} by including
|
|
811 |
\begin{ttbox}\index{*ML section}
|
|
812 |
val print_translation =
|
|
813 |
[("Pi", dependent_tr' ("{\at}PROD", "{\at}->")),
|
|
814 |
("Sigma", dependent_tr' ("{\at}SUM", "{\at}*"))];
|
|
815 |
\end{ttbox}
|
|
816 |
within the {\tt ML} section. The first of these transforms ${\tt Pi}(A,
|
|
817 |
\mtt{Abs}(x, T, B))$ into $\hbox{\tt{\at}PROD}(x', A, B')$ or
|
|
818 |
$\hbox{\tt{\at}->}r(A, B)$, choosing the latter form if $B$ does not depend
|
|
819 |
on~$x$. It checks this using \ttindex{loose_bnos}, yet another function
|
|
820 |
from {\tt Pure/term.ML}. Note that $x'$ is a version of $x$ renamed away
|
|
821 |
from all names in $B$, and $B'$ the body $B$ with {\tt Bound} nodes
|
|
822 |
referring to our {\tt Abs} node replaced by $\ttfct{Free} (x',
|
|
823 |
\mtt{dummyT})$.
|
|
824 |
|
|
825 |
We must be careful with types here. While types of {\tt Const}s are
|
|
826 |
ignored, type constraints may be printed for some {\tt Free}s and
|
|
827 |
{\tt Var}s if \ttindex{show_types} is set to {\tt true}. Variables of type
|
|
828 |
\ttindex{dummyT} are never printed with constraint, though. The line
|
|
829 |
\begin{ttbox}
|
|
830 |
let val (x', B') = variant_abs (x, dummyT, B);
|
|
831 |
\end{ttbox}\index{*variant_abs}
|
|
832 |
replaces bound variable occurrences in~$B$ by the free variable $x'$ with
|
|
833 |
type {\tt dummyT}. Only the binding occurrence of~$x'$ is given the
|
|
834 |
correct type~{\tt T}, so this is the only place where a type
|
|
835 |
constraint might appear.
|
|
836 |
\index{translations|)}
|
|
837 |
\index{syntax!transformations|)}
|