author | wenzelm |
Thu, 06 Dec 2001 00:46:24 +0100 | |
changeset 12406 | c9775847ed66 |
parent 291 | a615050a7494 |
permissions | -rw-r--r-- |
104 | 1 |
%% $Id$ |
291 | 2 |
%% \([a-zA-Z][a-zA-Z]}\.\) \([^ ]\) \1 \2 |
3 |
%% @\([a-z0-9]\) ^{(\1)} |
|
108 | 4 |
|
291 | 5 |
\newcommand\rmindex[1]{{#1}\index{#1}\@} |
6 |
\newcommand\mtt[1]{\mbox{\tt #1}} |
|
7 |
\newcommand\ttfct[1]{\mathop{\mtt{#1}}\nolimits} |
|
8 |
\newcommand\ttapp{\mathrel{\hbox{\tt\$}}} |
|
9 |
\newcommand\Constant{\ttfct{Constant}} |
|
10 |
\newcommand\Variable{\ttfct{Variable}} |
|
11 |
\newcommand\Appl[1]{\ttfct{Appl}\mathopen{\mtt[}#1\mathclose{\mtt]}} |
|
12 |
\newcommand\AST{{\sc ast}} |
|
13 |
\let\rew=\longrightarrow |
|
104 | 14 |
|
15 |
||
16 |
\chapter{Defining Logics} \label{Defining-Logics} |
|
17 |
||
291 | 18 |
This chapter explains how to define new formal systems --- in particular, |
19 |
their concrete syntax. While Isabelle can be regarded as a theorem prover |
|
20 |
for set theory, higher-order logic or the sequent calculus, its |
|
21 |
distinguishing feature is support for the definition of new logics. |
|
104 | 22 |
|
291 | 23 |
Isabelle logics are hierarchies of theories, which are described and |
24 |
illustrated in {\em Introduction to Isabelle}. That material, together |
|
25 |
with the theory files provided in the examples directories, should suffice |
|
26 |
for all simple applications. The easiest way to define a new theory is by |
|
27 |
modifying a copy of an existing theory. |
|
28 |
||
29 |
This chapter is intended for experienced Isabelle users. It documents all |
|
30 |
aspects of theories concerned with syntax: mixfix declarations, pretty |
|
31 |
printing, macros and translation functions. The extended examples of |
|
32 |
\S\ref{sec:min_logics} demonstrate the logical aspects of the definition of |
|
33 |
theories. Sections marked with * are highly technical and might be skipped |
|
34 |
on the first reading. |
|
104 | 35 |
|
36 |
||
291 | 37 |
\section{Priority grammars} \label{sec:priority_grammars} |
38 |
\index{grammars!priority|(} |
|
104 | 39 |
|
291 | 40 |
The syntax of an Isabelle logic is specified by a {\bf priority grammar}. |
41 |
A context-free grammar\index{grammars!context-free} contains a set of |
|
42 |
productions of the form $A=\gamma$, where $A$ is a nonterminal and |
|
43 |
$\gamma$, the right-hand side, is a string of terminals and nonterminals. |
|
44 |
Isabelle uses an extended format permitting {\bf priorities}, or |
|
45 |
precedences. Each nonterminal is decorated by an integer priority, as |
|
46 |
in~$A^{(p)}$. A nonterminal $A^{(p)}$ in a derivation may be replaced |
|
47 |
using a production $A^{(q)} = \gamma$ only if $p \le q$. Any priority |
|
48 |
grammar can be translated into a normal context free grammar by introducing |
|
49 |
new nonterminals and productions. |
|
104 | 50 |
|
51 |
Formally, a set of context free productions $G$ induces a derivation |
|
291 | 52 |
relation $\rew@G$. Let $\alpha$ and $\beta$ denote strings of terminal or |
53 |
nonterminal symbols. Then |
|
54 |
\[ \alpha\, A^{(p)}\, \beta ~\rew@G~ \alpha\,\gamma\,\beta \] |
|
55 |
if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$q\ge p$. |
|
104 | 56 |
|
57 |
The following simple grammar for arithmetic expressions demonstrates how |
|
291 | 58 |
binding power and associativity of operators can be enforced by priorities. |
104 | 59 |
\begin{center} |
60 |
\begin{tabular}{rclr} |
|
291 | 61 |
$A^{(9)}$ & = & {\tt0} \\ |
62 |
$A^{(9)}$ & = & {\tt(} $A^{(0)}$ {\tt)} \\ |
|
63 |
$A^{(0)}$ & = & $A^{(0)}$ {\tt+} $A^{(1)}$ \\ |
|
64 |
$A^{(2)}$ & = & $A^{(3)}$ {\tt*} $A^{(2)}$ \\ |
|
65 |
$A^{(3)}$ & = & {\tt-} $A^{(3)}$ |
|
104 | 66 |
\end{tabular} |
67 |
\end{center} |
|
291 | 68 |
The choice of priorities determines that {\tt -} binds tighter than {\tt *}, |
69 |
which binds tighter than {\tt +}. Furthermore {\tt +} associates to the |
|
70 |
left and {\tt *} to the right. |
|
104 | 71 |
|
72 |
To minimize the number of subscripts, we adopt the following conventions: |
|
73 |
\begin{itemize} |
|
291 | 74 |
\item All priorities $p$ must be in the range $0 \leq p \leq max_pri$ for |
75 |
some fixed integer $max_pri$. |
|
76 |
\item Priority $0$ on the right-hand side and priority $max_pri$ on the |
|
104 | 77 |
left-hand side may be omitted. |
78 |
\end{itemize} |
|
291 | 79 |
The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; |
80 |
the priority of the left-hand side actually appears in a column on the far |
|
81 |
right. Finally, alternatives may be separated by $|$, and repetition |
|
104 | 82 |
indicated by \dots. |
83 |
||
291 | 84 |
Using these conventions and assuming $max_pri=9$, the grammar takes the form |
104 | 85 |
\begin{center} |
86 |
\begin{tabular}{rclc} |
|
87 |
$A$ & = & {\tt0} & \hspace*{4em} \\ |
|
88 |
& $|$ & {\tt(} $A$ {\tt)} \\ |
|
291 | 89 |
& $|$ & $A$ {\tt+} $A^{(1)}$ & (0) \\ |
90 |
& $|$ & $A^{(3)}$ {\tt*} $A^{(2)}$ & (2) \\ |
|
91 |
& $|$ & {\tt-} $A^{(3)}$ & (3) |
|
104 | 92 |
\end{tabular} |
93 |
\end{center} |
|
291 | 94 |
\index{grammars!priority|)} |
104 | 95 |
|
96 |
||
291 | 97 |
\begin{figure} |
104 | 98 |
\begin{center} |
99 |
\begin{tabular}{rclc} |
|
100 |
$prop$ &=& \ttindex{PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\ |
|
291 | 101 |
&$|$& $logic^{(3)}$ \ttindex{==} $logic^{(2)}$ & (2) \\ |
102 |
&$|$& $logic^{(3)}$ \ttindex{=?=} $logic^{(2)}$ & (2) \\ |
|
103 |
&$|$& $prop^{(2)}$ \ttindex{==>} $prop^{(1)}$ & (1) \\ |
|
104 |
&$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop^{(1)}$ & (1) \\ |
|
104 | 105 |
&$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\ |
106 |
$logic$ &=& $prop$ ~~$|$~~ $fun$ \\\\ |
|
107 |
$aprop$ &=& $id$ ~~$|$~~ $var$ |
|
291 | 108 |
~~$|$~~ $fun^{(max_pri)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\ |
104 | 109 |
$fun$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $fun$ {\tt)} \\ |
291 | 110 |
&$|$& $fun^{(max_pri)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\ |
111 |
&$|$& $fun^{(max_pri)}$ {\tt::} $type$ \\ |
|
104 | 112 |
&$|$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\ |
291 | 113 |
$idts$ &=& $idt$ ~~$|$~~ $idt^{(1)}$ $idts$ \\\\ |
104 | 114 |
$idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\ |
115 |
&$|$& $id$ \ttindex{::} $type$ & (0) \\\\ |
|
291 | 116 |
$type$ &=& $tid$ ~~$|$~~ $tvar$ ~~$|$~~ $tid$ {\tt::} $sort$ |
135 | 117 |
~~$|$~~ $tvar$ {\tt::} $sort$ \\ |
291 | 118 |
&$|$& $id$ ~~$|$~~ $type^{(max_pri)}$ $id$ |
104 | 119 |
~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\ |
291 | 120 |
&$|$& $type^{(1)}$ \ttindex{=>} $type$ & (0) \\ |
104 | 121 |
&$|$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\ |
122 |
&$|$& {\tt(} $type$ {\tt)} \\\\ |
|
123 |
$sort$ &=& $id$ ~~$|$~~ {\tt\ttlbrace\ttrbrace} |
|
124 |
~~$|$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace} |
|
125 |
\end{tabular}\index{*"!"!}\index{*"["|}\index{*"|"]} |
|
126 |
\indexbold{type@$type$} \indexbold{sort@$sort$} \indexbold{idt@$idt$} |
|
127 |
\indexbold{idts@$idts$} \indexbold{logic@$logic$} \indexbold{prop@$prop$} |
|
128 |
\indexbold{fun@$fun$} |
|
129 |
\end{center} |
|
291 | 130 |
\caption{Meta-logic syntax}\label{fig:pure_gram} |
104 | 131 |
\end{figure} |
132 |
||
291 | 133 |
|
134 |
\section{The Pure syntax} \label{sec:basic_syntax} |
|
135 |
\index{syntax!Pure|(} |
|
104 | 136 |
|
291 | 137 |
At the root of all object-logics lies the Pure theory,\index{theory!Pure} |
138 |
bound to the \ML{} identifier \ttindex{Pure.thy}. It contains, among many |
|
139 |
other things, the Pure syntax. An informal account of this basic syntax |
|
140 |
(meta-logic, types, \ldots) may be found in {\em Introduction to Isabelle}. |
|
141 |
A more precise description using a priority grammar is shown in |
|
142 |
Fig.\ts\ref{fig:pure_gram}. The following nonterminals are defined: |
|
143 |
\begin{description} |
|
144 |
\item[$prop$] Terms of type $prop$. These are formulae of the meta-logic. |
|
104 | 145 |
|
291 | 146 |
\item[$aprop$] Atomic propositions. These typically include the |
147 |
judgement forms of the object-logic; its definition introduces a |
|
148 |
meta-level predicate for each judgement form. |
|
149 |
||
150 |
\item[$logic$] Terms whose type belongs to class $logic$. Initially, |
|
151 |
this category contains just $prop$. As the syntax is extended by new |
|
152 |
object-logics, more productions for $logic$ are added automatically |
|
153 |
(see below). |
|
104 | 154 |
|
155 |
\item[$fun$] Terms potentially of function type. |
|
156 |
||
291 | 157 |
\item[$type$] Types of the meta-logic. |
104 | 158 |
|
291 | 159 |
\item[$idts$] A list of identifiers, possibly constrained by types. |
104 | 160 |
\end{description} |
161 |
||
291 | 162 |
\begin{warn} |
163 |
Note that \verb|x::nat y| is parsed as \verb|x::(nat y)|, treating {\tt |
|
164 |
y} like a type constructor applied to {\tt nat}. The likely result is |
|
165 |
an error message. To avoid this interpretation, use parentheses and |
|
166 |
write \verb|(x::nat) y|. |
|
104 | 167 |
|
291 | 168 |
Similarly, \verb|x::nat y::nat| is parsed as \verb|x::(nat y::nat)| and |
169 |
yields a syntax error. The correct form is \verb|(x::nat) (y::nat)|. |
|
170 |
\end{warn} |
|
171 |
||
172 |
\subsection{Logical types and default syntax}\label{logical-types} |
|
173 |
Isabelle's representation of mathematical languages is based on the typed |
|
174 |
$\lambda$-calculus. All logical types, namely those of class $logic$, are |
|
175 |
automatically equipped with a basic syntax of types, identifiers, |
|
176 |
variables, parentheses, $\lambda$-abstractions and applications. |
|
177 |
||
178 |
More precisely, for each type constructor $ty$ with arity $(\vec{s})c$, |
|
179 |
where $c$ is a subclass of $logic$, several productions are added: |
|
104 | 180 |
\begin{center} |
181 |
\begin{tabular}{rclc} |
|
182 |
$ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\ |
|
291 | 183 |
&$|$& $fun^{(max_pri)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\ |
184 |
&$|$& $ty^{(max_pri)}$ {\tt::} $type$\\\\ |
|
104 | 185 |
$logic$ &=& $ty$ |
186 |
\end{tabular} |
|
187 |
\end{center} |
|
188 |
||
189 |
||
291 | 190 |
\subsection{Lexical matters} |
191 |
The parser does not process input strings directly. It operates on token |
|
192 |
lists provided by Isabelle's \bfindex{lexer}. There are two kinds of |
|
193 |
tokens: \bfindex{delimiters} and \bfindex{name tokens}. |
|
104 | 194 |
|
291 | 195 |
Delimiters can be regarded as reserved words of the syntax. You can |
196 |
add new ones when extending theories. In Fig.\ts\ref{fig:pure_gram} they |
|
197 |
appear in typewriter font, for example {\tt ==}, {\tt =?=} and |
|
198 |
{\tt PROP}\@. |
|
104 | 199 |
|
291 | 200 |
Name tokens have a predefined syntax. The lexer distinguishes four |
201 |
disjoint classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type |
|
202 |
identifiers\index{identifiers!type}, type unknowns\index{unknowns!type}. |
|
203 |
They are denoted by $id$\index{id@$id$}, $var$\index{var@$var$}, |
|
204 |
$tid$\index{tid@$tid$}, $tvar$\index{tvar@$tvar$}, respectively. Typical |
|
205 |
examples are {\tt x}, {\tt ?x7}, {\tt 'a}, {\tt ?'a3}. Here is the precise |
|
206 |
syntax: |
|
188
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
207 |
\begin{eqnarray*} |
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
208 |
id & = & letter~quasiletter^* \\ |
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
209 |
var & = & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\ |
291 | 210 |
tid & = & \mbox{\tt '}id \\ |
211 |
tvar & = & \mbox{\tt ?}tid ~~|~~ |
|
212 |
\mbox{\tt ?}tid\mbox{\tt .}nat \\[1ex] |
|
188
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
213 |
letter & = & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\ |
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
214 |
digit & = & \mbox{one of {\tt 0}\dots {\tt 9}} \\ |
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
215 |
quasiletter & = & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\ |
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
216 |
nat & = & digit^+ |
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
217 |
\end{eqnarray*} |
291 | 218 |
A $var$ or $tvar$ describes an unknown, which is internally a pair |
188
6be0856cdf49
last minute changes, eg literal tokens -> delimiters and valued tokens ->
nipkow
parents:
142
diff
changeset
|
219 |
of base name and index (\ML\ type \ttindex{indexname}). These components are |
291 | 220 |
either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or |
221 |
run together as in {\tt ?x1}. The latter form is possible if the |
|
222 |
base name does not end with digits. If the index is 0, it may be dropped |
|
223 |
altogether: {\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}. |
|
104 | 224 |
|
291 | 225 |
The lexer repeatedly takes the maximal prefix of the input string that |
226 |
forms a valid token. A maximal prefix that is both a delimiter and a name |
|
227 |
is treated as a delimiter. Spaces, tabs and newlines are separators; they |
|
228 |
never occur within tokens. |
|
104 | 229 |
|
291 | 230 |
Delimiters need not be separated by white space. For example, if {\tt -} |
231 |
is a delimiter but {\tt --} is not, then the string {\tt --} is treated as |
|
232 |
two consecutive occurrences of the token~{\tt -}. In contrast, \ML\ |
|
233 |
treats {\tt --} as a single symbolic name. The consequence of Isabelle's |
|
234 |
more liberal scheme is that the same string may be parsed in different ways |
|
235 |
after extending the syntax: after adding {\tt --} as a delimiter, the input |
|
236 |
{\tt --} is treated as a single token. |
|
104 | 237 |
|
291 | 238 |
Name tokens are terminal symbols, strictly speaking, but we can generally |
239 |
regard them as nonterminals. This is because a name token carries with it |
|
240 |
useful information, the name. Delimiters, on the other hand, are nothing |
|
241 |
but than syntactic sugar. |
|
104 | 242 |
|
243 |
||
291 | 244 |
\subsection{*Inspecting the syntax} |
104 | 245 |
\begin{ttbox} |
291 | 246 |
syn_of : theory -> Syntax.syntax |
247 |
Syntax.print_syntax : Syntax.syntax -> unit |
|
248 |
Syntax.print_gram : Syntax.syntax -> unit |
|
249 |
Syntax.print_trans : Syntax.syntax -> unit |
|
104 | 250 |
\end{ttbox} |
291 | 251 |
The abstract type \ttindex{Syntax.syntax} allows manipulation of syntaxes |
252 |
in \ML. You can display values of this type by calling the following |
|
253 |
functions: |
|
254 |
\begin{description} |
|
255 |
\item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle |
|
256 |
theory~{\it thy} as an \ML\ value. |
|
257 |
||
258 |
\item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all |
|
259 |
information contained in the syntax {\it syn}. The displayed output can |
|
260 |
be large. The following two functions are more selective. |
|
104 | 261 |
|
291 | 262 |
\item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part |
263 |
of~{\it syn}, namely the lexicon, roots and productions. |
|
104 | 264 |
|
291 | 265 |
\item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation |
266 |
part of~{\it syn}, namely the constants, parse/print macros and |
|
267 |
parse/print translations. |
|
268 |
\end{description} |
|
269 |
||
270 |
Let us demonstrate these functions by inspecting Pure's syntax. Even that |
|
271 |
is too verbose to display in full. |
|
104 | 272 |
\begin{ttbox} |
273 |
Syntax.print_syntax (syn_of Pure.thy); |
|
291 | 274 |
{\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots} |
104 | 275 |
{\out roots: logic type fun prop} |
276 |
{\out prods:} |
|
291 | 277 |
{\out type = tid (1000)} |
104 | 278 |
{\out type = tvar (1000)} |
279 |
{\out type = id (1000)} |
|
291 | 280 |
{\out type = tid "::" sort[0] => "_ofsort" (1000)} |
104 | 281 |
{\out type = tvar "::" sort[0] => "_ofsort" (1000)} |
282 |
{\out \vdots} |
|
291 | 283 |
\ttbreak |
104 | 284 |
{\out consts: "_K" "_appl" "_aprop" "_args" "_asms" "_bigimpl" \dots} |
285 |
{\out parse_ast_translation: "_appl" "_bigimpl" "_bracket"} |
|
286 |
{\out "_idtyp" "_lambda" "_tapp" "_tappl"} |
|
287 |
{\out parse_rules:} |
|
288 |
{\out parse_translation: "!!" "_K" "_abs" "_aprop"} |
|
289 |
{\out print_translation: "all"} |
|
290 |
{\out print_rules:} |
|
291 |
{\out print_ast_translation: "==>" "_abs" "_idts" "fun"} |
|
292 |
\end{ttbox} |
|
293 |
||
291 | 294 |
As you can see, the output is divided into labeled sections. The grammar |
295 |
is represented by {\tt lexicon}, {\tt roots} and {\tt prods}. The rest |
|
296 |
refers to syntactic translations and macro expansion. Here is an |
|
297 |
explanation of the various sections. |
|
104 | 298 |
\begin{description} |
291 | 299 |
\item[\ttindex{lexicon}] lists the delimiters used for lexical |
300 |
analysis.\index{delimiters} |
|
104 | 301 |
|
291 | 302 |
\item[\ttindex{roots}] lists the grammar's nonterminal symbols. You must |
303 |
name the desired root when calling lower level functions or specifying |
|
304 |
macros. Higher level functions usually expect a type and derive the |
|
305 |
actual root as described in~\S\ref{sec:grammar}. |
|
104 | 306 |
|
291 | 307 |
\item[\ttindex{prods}] lists the productions of the priority grammar. |
308 |
The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}. |
|
309 |
Each delimiter is quoted. Some productions are shown with {\tt =>} and |
|
310 |
an attached string. These strings later become the heads of parse |
|
311 |
trees; they also play a vital role when terms are printed (see |
|
312 |
\S\ref{sec:asts}). |
|
104 | 313 |
|
291 | 314 |
Productions with no strings attached are called {\bf copy |
315 |
productions}\indexbold{productions!copy}. Their right-hand side must |
|
316 |
have exactly one nonterminal symbol (or name token). The parser does |
|
317 |
not create a new parse tree node for copy productions, but simply |
|
318 |
returns the parse tree of the right-hand symbol. |
|
104 | 319 |
|
291 | 320 |
If the right-hand side consists of a single nonterminal with no |
321 |
delimiters, then the copy production is called a {\bf chain |
|
322 |
production}\indexbold{productions!chain}. Chain productions should |
|
323 |
be seen as abbreviations: conceptually, they are removed from the |
|
324 |
grammar by adding new productions. Priority information |
|
325 |
attached to chain productions is ignored, only the dummy value $-1$ is |
|
326 |
displayed. |
|
104 | 327 |
|
328 |
\item[\ttindex{consts}, \ttindex{parse_rules}, \ttindex{print_rules}] |
|
291 | 329 |
relate to macros (see \S\ref{sec:macros}). |
104 | 330 |
|
291 | 331 |
\item[\ttindex{parse_ast_translation}, \ttindex{print_ast_translation}] |
332 |
list sets of constants that invoke translation functions for abstract |
|
333 |
syntax trees. Section \S\ref{sec:asts} below discusses this obscure |
|
334 |
matter. |
|
104 | 335 |
|
291 | 336 |
\item[\ttindex{parse_translation}, \ttindex{print_translation}] list sets |
337 |
of constants that invoke translation functions for terms (see |
|
338 |
\S\ref{sec:tr_funs}). |
|
339 |
\end{description} |
|
340 |
\index{syntax!Pure|)} |
|
104 | 341 |
|
342 |
||
343 |
\section{Mixfix declarations} \label{sec:mixfix} |
|
291 | 344 |
\index{mixfix declaration|(} |
104 | 345 |
|
291 | 346 |
When defining a theory, you declare new constants by giving their names, |
347 |
their type, and an optional {\bf mixfix annotation}. Mixfix annotations |
|
348 |
allow you to extend Isabelle's basic $\lambda$-calculus syntax with |
|
349 |
readable notation. They can express any context-free priority grammar. |
|
350 |
Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more |
|
351 |
general than the priority declarations of \ML\ and Prolog. |
|
352 |
||
353 |
A mixfix annotation defines a production of the priority grammar. It |
|
354 |
describes the concrete syntax, the translation to abstract syntax, and the |
|
355 |
pretty printing. Special case annotations provide a simple means of |
|
356 |
specifying infix operators, binders and so forth. |
|
104 | 357 |
|
291 | 358 |
\subsection{Grammar productions}\label{sec:grammar} |
359 |
Let us examine the treatment of the production |
|
360 |
\[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\, |
|
361 |
A@n^{(p@n)}\, w@n. \] |
|
362 |
Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$, |
|
363 |
\ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals. |
|
364 |
In the corresponding mixfix annotation, the priorities are given separately |
|
365 |
as $[p@1,\ldots,p@n]$ and~$p$. The nonterminal symbols are identified with |
|
366 |
types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's |
|
367 |
effect on nonterminals is expressed as the function type |
|
368 |
\[ [\tau@1, \ldots, \tau@n]\To \tau. \] |
|
369 |
Finally, the template |
|
370 |
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n \] |
|
371 |
describes the strings of terminals. |
|
104 | 372 |
|
291 | 373 |
A simple type is typically declared for each nonterminal symbol. In |
374 |
first-order logic, type~$i$ stands for terms and~$o$ for formulae. Only |
|
375 |
the outermost type constructor is taken into account. For example, any |
|
376 |
type of the form $\sigma list$ stands for a list; productions may refer |
|
377 |
to the symbol {\tt list} and will apply lists of any type. |
|
378 |
||
379 |
The symbol associated with a type is called its {\bf root} since it may |
|
380 |
serve as the root of a parse tree. Precisely, the root of $(\tau@1, \dots, |
|
381 |
\tau@n)ty$ is $ty$, where $\tau@1$, \ldots, $\tau@n$ are types and $ty$ is |
|
382 |
a type constructor. Type infixes are a special case of this; in |
|
383 |
particular, the root of $\tau@1 \To \tau@2$ is {\tt fun}. Finally, the |
|
384 |
root of a type variable is {\tt logic}; general productions might |
|
385 |
refer to this nonterminal. |
|
104 | 386 |
|
291 | 387 |
Identifying nonterminals with types allows a constant's type to specify |
388 |
syntax as well. We can declare the function~$f$ to have type $[\tau@1, |
|
389 |
\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the |
|
390 |
layout of the function's $n$ arguments. The constant's name, in this |
|
391 |
case~$f$, will also serve as the label in the abstract syntax tree. There |
|
392 |
are two exceptions to this treatment of constants: |
|
393 |
\begin{enumerate} |
|
394 |
\item A production need not map directly to a logical function. In this |
|
395 |
case, you must declare a constant whose purpose is purely syntactic. |
|
396 |
By convention such constants begin with the symbol~{\tt\at}, |
|
397 |
ensuring that they can never be written in formulae. |
|
398 |
||
399 |
\item A copy production has no associated constant. |
|
400 |
\end{enumerate} |
|
401 |
There is something artificial about this representation of productions, |
|
402 |
but it is convenient, particularly for simple theory extensions. |
|
104 | 403 |
|
291 | 404 |
\subsection{The general mixfix form} |
405 |
Here is a detailed account of the general \bfindex{mixfix declaration} as |
|
406 |
it may occur within the {\tt consts} section of a {\tt .thy} file. |
|
407 |
\begin{center} |
|
408 |
{\tt "$c$" ::\ "$\sigma$" ("$template$" $ps$ $p$)} |
|
409 |
\end{center} |
|
410 |
This constant declaration and mixfix annotation is interpreted as follows: |
|
411 |
\begin{itemize} |
|
412 |
\item The string {\tt "$c$"} is the name of the constant associated with |
|
413 |
the production. If $c$ is empty (given as~{\tt ""}) then this is a copy |
|
414 |
production.\index{productions!copy} Otherwise, parsing an instance of the |
|
415 |
phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$ |
|
416 |
$a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th |
|
417 |
argument. |
|
418 |
||
419 |
\item The constant $c$, if non-empty, is declared to have type $\sigma$. |
|
104 | 420 |
|
291 | 421 |
\item The string $template$ specifies the right-hand side of |
422 |
the production. It has the form |
|
423 |
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] |
|
424 |
where each occurrence of \ttindex{_} denotes an |
|
425 |
argument\index{argument!mixfix} position and the~$w@i$ do not |
|
426 |
contain~{\tt _}. (If you want a literal~{\tt _} in the concrete |
|
427 |
syntax, you must escape it as described below.) The $w@i$ may |
|
428 |
consist of \rmindex{delimiters}, spaces or \rmindex{pretty |
|
429 |
printing} annotations (see below). |
|
104 | 430 |
|
291 | 431 |
\item The type $\sigma$ specifies the production's nonterminal symbols (or name |
432 |
tokens). If $template$ is of the form above then $\sigma$ must be a |
|
433 |
function type with at least~$n$ argument positions, say $\sigma = |
|
434 |
[\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are derived |
|
435 |
from the type $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described above. |
|
436 |
Any of these may be function types; the corresponding root is then {\tt |
|
437 |
fun}. |
|
104 | 438 |
|
291 | 439 |
\item The optional list~$ps$ may contain at most $n$ integers, say {\tt |
440 |
[$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal |
|
441 |
priority\indexbold{priorities} required of any phrase that may appear |
|
442 |
as the $i$-th argument. Missing priorities default to~$0$. |
|
443 |
||
444 |
\item The integer $p$ is the priority of this production. If omitted, it |
|
445 |
defaults to the maximal priority. |
|
446 |
||
447 |
Priorities, or precedences, range between $0$ and |
|
448 |
$max_pri$\indexbold{max_pri@$max_pri$} (= 1000). |
|
104 | 449 |
\end{itemize} |
450 |
||
291 | 451 |
The declaration {\tt $c$ ::\ "$\sigma$" ("$template$")} specifies no |
452 |
priorities. The resulting production puts no priority constraints on any |
|
453 |
of its arguments and has maximal priority itself. Omitting priorities in |
|
454 |
this manner will introduce syntactic ambiguities unless the production's |
|
455 |
right-hand side is fully bracketed, as in \verb|"if _ then _ else _ fi"|. |
|
104 | 456 |
|
291 | 457 |
\begin{warn} |
458 |
Theories must sometimes declare types for purely syntactic purposes. One |
|
459 |
example is {\tt type}, the built-in type of types. This is a `type of |
|
460 |
all types' in the syntactic sense only. Do not declare such types under |
|
461 |
{\tt arities} as belonging to class $logic$, for that would allow their |
|
462 |
use in arbitrary Isabelle expressions~(\S\ref{logical-types}). |
|
463 |
\end{warn} |
|
464 |
||
465 |
\subsection{Example: arithmetic expressions} |
|
466 |
This theory specification contains a {\tt consts} section with mixfix |
|
467 |
declarations encoding the priority grammar from |
|
468 |
\S\ref{sec:priority_grammars}: |
|
104 | 469 |
\begin{ttbox} |
470 |
EXP = Pure + |
|
471 |
types |
|
291 | 472 |
exp |
104 | 473 |
arities |
474 |
exp :: logic |
|
475 |
consts |
|
291 | 476 |
"0" :: "exp" ("0" 9) |
477 |
"+" :: "[exp, exp] => exp" ("_ + _" [0, 1] 0) |
|
478 |
"*" :: "[exp, exp] => exp" ("_ * _" [3, 2] 2) |
|
479 |
"-" :: "exp => exp" ("- _" [3] 3) |
|
104 | 480 |
end |
481 |
\end{ttbox} |
|
482 |
Note that the {\tt arities} declaration causes {\tt exp} to be added to the |
|
291 | 483 |
syntax' roots. If you put the text above into a file {\tt exp.thy} and load |
135 | 484 |
it via {\tt use_thy "EXP"}, you can run some tests: |
104 | 485 |
\begin{ttbox} |
486 |
val read_exp = Syntax.test_read (syn_of EXP.thy) "exp"; |
|
291 | 487 |
{\out val it = fn : string -> unit} |
104 | 488 |
read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0"; |
489 |
{\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"} |
|
490 |
{\out raw: ("+" ("+" ("+" ("*" "0" ("*" "0" ("*" "0" "0"))) "0") "0") "0")} |
|
491 |
{\out \vdots} |
|
492 |
read_exp "0 + - 0 + 0"; |
|
493 |
{\out tokens: "0" "+" "-" "0" "+" "0"} |
|
494 |
{\out raw: ("+" ("+" "0" ("-" "0")) "0")} |
|
495 |
{\out \vdots} |
|
496 |
\end{ttbox} |
|
497 |
The output of \ttindex{Syntax.test_read} includes the token list ({\tt |
|
291 | 498 |
tokens}) and the raw \AST{} directly derived from the parse tree, |
499 |
ignoring parse \AST{} translations. The rest is tracing information |
|
500 |
provided by the macro expander (see \S\ref{sec:macros}). |
|
104 | 501 |
|
291 | 502 |
Executing {\tt Syntax.print_gram} reveals the productions derived |
503 |
from our mixfix declarations (lots of additional information deleted): |
|
104 | 504 |
\begin{ttbox} |
291 | 505 |
Syntax.print_gram (syn_of EXP.thy); |
506 |
{\out exp = "0" => "0" (9)} |
|
507 |
{\out exp = exp[0] "+" exp[1] => "+" (0)} |
|
508 |
{\out exp = exp[3] "*" exp[2] => "*" (2)} |
|
509 |
{\out exp = "-" exp[3] => "-" (3)} |
|
104 | 510 |
\end{ttbox} |
511 |
||
291 | 512 |
|
513 |
\subsection{The mixfix template} |
|
514 |
Let us take a closer look at the string $template$ appearing in mixfix |
|
515 |
annotations. This string specifies a list of parsing and printing |
|
516 |
directives: delimiters\index{delimiter}, arguments\index{argument!mixfix}, |
|
517 |
spaces, blocks of indentation and line breaks. These are encoded via the |
|
104 | 518 |
following character sequences: |
519 |
||
291 | 520 |
\index{pretty printing|(} |
104 | 521 |
\begin{description} |
291 | 522 |
\item[~\ttindex_~] An argument\index{argument!mixfix} position, which |
523 |
stands for a nonterminal symbol or name token. |
|
104 | 524 |
|
291 | 525 |
\item[~$d$~] A \rmindex{delimiter}, namely a non-empty sequence of |
526 |
non-special or escaped characters. Escaping a character\index{escape |
|
527 |
character} means preceding it with a {\tt '} (single quote). Thus |
|
528 |
you have to write {\tt ''} if you really want a single quote. You must |
|
529 |
also escape {\tt _}, {\tt (}, {\tt )} and {\tt /}. Delimiters may |
|
530 |
never contain white space, though. |
|
104 | 531 |
|
291 | 532 |
\item[~$s$~] A non-empty sequence of spaces for printing. This |
533 |
and the following specifications do not affect parsing at all. |
|
104 | 534 |
|
291 | 535 |
\item[~{\ttindex($n$}~] Open a pretty printing block. The optional |
536 |
number $n$ specifies how much indentation to add when a line break |
|
537 |
occurs within the block. If {\tt(} is not followed by digits, the |
|
538 |
indentation defaults to~$0$. |
|
104 | 539 |
|
291 | 540 |
\item[~\ttindex)~] Close a pretty printing block. |
104 | 541 |
|
291 | 542 |
\item[~\ttindex{//}~] Force a line break. |
104 | 543 |
|
291 | 544 |
\item[~\ttindex/$s$~] Allow a line break. Here $s$ stands for the string |
545 |
of spaces (zero or more) right after the {\tt /} character. These |
|
546 |
spaces are printed if the break is not taken. |
|
547 |
\end{description} |
|
548 |
Isabelle's pretty printer resembles the one described in |
|
549 |
Paulson~\cite{paulson91}. \index{pretty printing|)} |
|
104 | 550 |
|
551 |
||
552 |
\subsection{Infixes} |
|
291 | 553 |
\indexbold{infix operators} |
554 |
Infix operators associating to the left or right can be declared |
|
555 |
using {\tt infixl} or {\tt infixr}. |
|
556 |
Roughly speaking, the form {\tt $c$ ::\ "$\sigma$" (infixl $p$)} |
|
557 |
abbreviates the declarations |
|
104 | 558 |
\begin{ttbox} |
291 | 559 |
"op \(c\)" :: "\(\sigma\)" ("op \(c\)") |
560 |
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\)) |
|
104 | 561 |
\end{ttbox} |
291 | 562 |
and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the declarations |
104 | 563 |
\begin{ttbox} |
291 | 564 |
"op \(c\)" :: "\(\sigma\)" ("op \(c\)") |
565 |
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\)) |
|
104 | 566 |
\end{ttbox} |
291 | 567 |
The infix operator is declared as a constant with the prefix {\tt op}. |
104 | 568 |
Thus, prefixing infixes with \ttindex{op} makes them behave like ordinary |
291 | 569 |
function symbols, as in \ML. Special characters occurring in~$c$ must be |
570 |
escaped, as in delimiters, using a single quote. |
|
571 |
||
572 |
The expanded forms above would be actually illegal in a {\tt .thy} file |
|
573 |
because they declare the constant \hbox{\tt"op \(c\)"} twice. |
|
104 | 574 |
|
575 |
||
576 |
\subsection{Binders} |
|
291 | 577 |
\indexbold{binders} |
578 |
\begingroup |
|
579 |
\def\Q{{\cal Q}} |
|
580 |
A {\bf binder} is a variable-binding construct such as a quantifier. The |
|
581 |
binder declaration \indexbold{*binder} |
|
104 | 582 |
\begin{ttbox} |
291 | 583 |
\(c\) :: "\(\sigma\)" (binder "\(\Q\)" \(p\)) |
104 | 584 |
\end{ttbox} |
291 | 585 |
introduces a constant~$c$ of type~$\sigma$, which must have the form |
586 |
$(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where |
|
587 |
$x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$ |
|
588 |
and the whole term has type~$\tau@3$. Special characters in $\Q$ must be |
|
589 |
escaped using a single quote. |
|
590 |
||
591 |
Let us declare the quantifier~$\forall$: |
|
104 | 592 |
\begin{ttbox} |
593 |
All :: "('a => o) => o" (binder "ALL " 10) |
|
594 |
\end{ttbox} |
|
291 | 595 |
This let us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL |
596 |
$x$.$P$}. When printing, Isabelle prefers the latter form, but must fall |
|
597 |
back on $\mtt{All}(P)$ if $P$ is not an abstraction. Both $P$ and {\tt ALL |
|
598 |
$x$.$P$} have type~$o$, the type of formulae, while the bound variable |
|
599 |
can be polymorphic. |
|
104 | 600 |
|
291 | 601 |
The binder~$c$ of type $(\sigma \To \tau) \To \tau$ can be nested. The |
602 |
external form $\Q~x@1~x@2 \ldots x@n. P$ corresponds to the internal form |
|
603 |
\[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)) \] |
|
104 | 604 |
|
605 |
\medskip |
|
606 |
The general binder declaration |
|
607 |
\begin{ttbox} |
|
291 | 608 |
\(c\) :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)" (binder "\(\Q\)" \(p\)) |
104 | 609 |
\end{ttbox} |
610 |
is internally expanded to |
|
611 |
\begin{ttbox} |
|
291 | 612 |
\(c\) :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)" |
613 |
"\(\Q\)"\hskip-3pt :: "[idts, \(\tau@2\)] => \(\tau@3\)" ("(3\(\Q\)_./ _)" \(p\)) |
|
104 | 614 |
\end{ttbox} |
291 | 615 |
with $idts$ being the nonterminal symbol for a list of $id$s optionally |
616 |
constrained (see Fig.\ts\ref{fig:pure_gram}). The declaration also |
|
617 |
installs a parse translation\index{translations!parse} for~$\Q$ and a print |
|
618 |
translation\index{translations!print} for~$c$ to translate between the |
|
619 |
internal and external forms. |
|
620 |
\endgroup |
|
104 | 621 |
|
291 | 622 |
\index{mixfix declaration|)} |
104 | 623 |
|
624 |
||
625 |
\section{Example: some minimal logics} \label{sec:min_logics} |
|
291 | 626 |
This section presents some examples that have a simple syntax. They |
627 |
demonstrate how to define new object-logics from scratch. |
|
104 | 628 |
|
291 | 629 |
First we must define how an object-logic syntax embedded into the |
630 |
meta-logic. Since all theorems must conform to the syntax for~$prop$ (see |
|
631 |
Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the |
|
632 |
object-level syntax. Assume that the syntax of your object-logic defines a |
|
633 |
nonterminal symbol~$o$ of formulae. These formulae can now appear in |
|
634 |
axioms and theorems wherever $prop$ does if you add the production |
|
104 | 635 |
\[ prop ~=~ o. \] |
291 | 636 |
This is not a copy production but a coercion from formulae to propositions: |
104 | 637 |
\begin{ttbox} |
638 |
Base = Pure + |
|
639 |
types |
|
291 | 640 |
o |
104 | 641 |
arities |
642 |
o :: logic |
|
643 |
consts |
|
644 |
Trueprop :: "o => prop" ("_" 5) |
|
645 |
end |
|
646 |
\end{ttbox} |
|
647 |
The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible |
|
291 | 648 |
coercion function. Assuming this definition resides in a file {\tt base.thy}, |
135 | 649 |
you have to load it with the command {\tt use_thy "Base"}. |
104 | 650 |
|
291 | 651 |
One of the simplest nontrivial logics is {\bf minimal logic} of |
652 |
implication. Its definition in Isabelle needs no advanced features but |
|
653 |
illustrates the overall mechanism nicely: |
|
104 | 654 |
\begin{ttbox} |
655 |
Hilbert = Base + |
|
656 |
consts |
|
657 |
"-->" :: "[o, o] => o" (infixr 10) |
|
658 |
rules |
|
659 |
K "P --> Q --> P" |
|
660 |
S "(P --> Q --> R) --> (P --> Q) --> P --> R" |
|
661 |
MP "[| P --> Q; P |] ==> Q" |
|
662 |
end |
|
663 |
\end{ttbox} |
|
291 | 664 |
After loading this definition from the file {\tt hilbert.thy}, you can |
665 |
start to prove theorems in the logic: |
|
104 | 666 |
\begin{ttbox} |
667 |
goal Hilbert.thy "P --> P"; |
|
668 |
{\out Level 0} |
|
669 |
{\out P --> P} |
|
670 |
{\out 1. P --> P} |
|
291 | 671 |
\ttbreak |
104 | 672 |
by (resolve_tac [Hilbert.MP] 1); |
673 |
{\out Level 1} |
|
674 |
{\out P --> P} |
|
675 |
{\out 1. ?P --> P --> P} |
|
676 |
{\out 2. ?P} |
|
291 | 677 |
\ttbreak |
104 | 678 |
by (resolve_tac [Hilbert.MP] 1); |
679 |
{\out Level 2} |
|
680 |
{\out P --> P} |
|
681 |
{\out 1. ?P1 --> ?P --> P --> P} |
|
682 |
{\out 2. ?P1} |
|
683 |
{\out 3. ?P} |
|
291 | 684 |
\ttbreak |
104 | 685 |
by (resolve_tac [Hilbert.S] 1); |
686 |
{\out Level 3} |
|
687 |
{\out P --> P} |
|
688 |
{\out 1. P --> ?Q2 --> P} |
|
689 |
{\out 2. P --> ?Q2} |
|
291 | 690 |
\ttbreak |
104 | 691 |
by (resolve_tac [Hilbert.K] 1); |
692 |
{\out Level 4} |
|
693 |
{\out P --> P} |
|
694 |
{\out 1. P --> ?Q2} |
|
291 | 695 |
\ttbreak |
104 | 696 |
by (resolve_tac [Hilbert.K] 1); |
697 |
{\out Level 5} |
|
698 |
{\out P --> P} |
|
699 |
{\out No subgoals!} |
|
700 |
\end{ttbox} |
|
291 | 701 |
As we can see, this Hilbert-style formulation of minimal logic is easy to |
702 |
define but difficult to use. The following natural deduction formulation is |
|
703 |
better: |
|
104 | 704 |
\begin{ttbox} |
705 |
MinI = Base + |
|
706 |
consts |
|
707 |
"-->" :: "[o, o] => o" (infixr 10) |
|
708 |
rules |
|
709 |
impI "(P ==> Q) ==> P --> Q" |
|
710 |
impE "[| P --> Q; P |] ==> Q" |
|
711 |
end |
|
712 |
\end{ttbox} |
|
291 | 713 |
Note, however, that although the two systems are equivalent, this fact |
714 |
cannot be proved within Isabelle. Axioms {\tt S} and {\tt K} can be |
|
715 |
derived in {\tt MinI} (exercise!), but {\tt impI} cannot be derived in {\tt |
|
716 |
Hilbert}. The reason is that {\tt impI} is only an {\bf admissible} rule |
|
717 |
in {\tt Hilbert}, something that can only be shown by induction over all |
|
718 |
possible proofs in {\tt Hilbert}. |
|
104 | 719 |
|
291 | 720 |
We may easily extend minimal logic with falsity: |
104 | 721 |
\begin{ttbox} |
722 |
MinIF = MinI + |
|
723 |
consts |
|
724 |
False :: "o" |
|
725 |
rules |
|
726 |
FalseE "False ==> P" |
|
727 |
end |
|
728 |
\end{ttbox} |
|
729 |
On the other hand, we may wish to introduce conjunction only: |
|
730 |
\begin{ttbox} |
|
731 |
MinC = Base + |
|
732 |
consts |
|
733 |
"&" :: "[o, o] => o" (infixr 30) |
|
291 | 734 |
\ttbreak |
104 | 735 |
rules |
736 |
conjI "[| P; Q |] ==> P & Q" |
|
737 |
conjE1 "P & Q ==> P" |
|
738 |
conjE2 "P & Q ==> Q" |
|
739 |
end |
|
740 |
\end{ttbox} |
|
291 | 741 |
And if we want to have all three connectives together, we create and load a |
742 |
theory file consisting of a single line:\footnote{We can combine the |
|
743 |
theories without creating a theory file using the ML declaration |
|
744 |
\begin{ttbox} |
|
745 |
val MinIFC_thy = merge_theories(MinIF,MinC) |
|
746 |
\end{ttbox} |
|
747 |
\index{*merge_theories|fnote}} |
|
104 | 748 |
\begin{ttbox} |
749 |
MinIFC = MinIF + MinC |
|
750 |
\end{ttbox} |
|
751 |
Now we can prove mixed theorems like |
|
752 |
\begin{ttbox} |
|
753 |
goal MinIFC.thy "P & False --> Q"; |
|
754 |
by (resolve_tac [MinI.impI] 1); |
|
755 |
by (dresolve_tac [MinC.conjE2] 1); |
|
756 |
by (eresolve_tac [MinIF.FalseE] 1); |
|
757 |
\end{ttbox} |
|
758 |
Try this as an exercise! |
|
759 |
||
291 | 760 |
\medskip |
761 |
Unless you need to define macros or syntax translation functions, you may |
|
762 |
skip the rest of this chapter. |
|
763 |
||
764 |
||
765 |
\section{*Abstract syntax trees} \label{sec:asts} |
|
766 |
\index{trees!abstract syntax|(} The parser, given a token list from the |
|
767 |
lexer, applies productions to yield a parse tree\index{trees!parse}. By |
|
768 |
applying some internal transformations the parse tree becomes an abstract |
|
769 |
syntax tree, or \AST{}. Macro expansion, further translations and finally |
|
770 |
type inference yields a well-typed term\index{terms!obtained from ASTs}. |
|
771 |
The printing process is the reverse, except for some subtleties to be |
|
772 |
discussed later. |
|
773 |
||
774 |
Figure~\ref{fig:parse_print} outlines the parsing and printing process. |
|
775 |
Much of the complexity is due to the macro mechanism. Using macros, you |
|
776 |
can specify most forms of concrete syntax without writing any \ML{} code. |
|
777 |
||
778 |
\begin{figure} |
|
779 |
\begin{center} |
|
780 |
\begin{tabular}{cl} |
|
781 |
string & \\ |
|
782 |
$\downarrow$ & parser \\ |
|
783 |
parse tree & \\ |
|
784 |
$\downarrow$ & parse \AST{} translation \\ |
|
785 |
\AST{} & \\ |
|
786 |
$\downarrow$ & \AST{} rewriting (macros) \\ |
|
787 |
\AST{} & \\ |
|
788 |
$\downarrow$ & parse translation, type inference \\ |
|
789 |
--- well-typed term --- & \\ |
|
790 |
$\downarrow$ & print translation \\ |
|
791 |
\AST{} & \\ |
|
792 |
$\downarrow$ & \AST{} rewriting (macros) \\ |
|
793 |
\AST{} & \\ |
|
794 |
$\downarrow$ & print \AST{} translation, printer \\ |
|
795 |
string & |
|
796 |
\end{tabular} |
|
797 |
\index{translations!parse}\index{translations!parse AST} |
|
798 |
\index{translations!print}\index{translations!print AST} |
|
799 |
||
800 |
\end{center} |
|
801 |
\caption{Parsing and printing}\label{fig:parse_print} |
|
802 |
\end{figure} |
|
803 |
||
804 |
Abstract syntax trees are an intermediate form between the raw parse trees |
|
805 |
and the typed $\lambda$-terms. An \AST{} is either an atom (constant or |
|
806 |
variable) or a list of {\em at least two\/} subtrees. Internally, they |
|
807 |
have type \ttindex{Syntax.ast}: \index{*Constant} \index{*Variable} |
|
808 |
\index{*Appl} |
|
809 |
\begin{ttbox} |
|
810 |
datatype ast = Constant of string |
|
811 |
| Variable of string |
|
812 |
| Appl of ast list |
|
813 |
\end{ttbox} |
|
814 |
||
815 |
Isabelle uses an S-expression syntax for abstract syntax trees. Constant |
|
816 |
atoms are shown as quoted strings, variable atoms as non-quoted strings and |
|
817 |
applications as a parenthesized list of subtrees. For example, the \AST |
|
818 |
\begin{ttbox} |
|
819 |
Appl [Constant "_constrain", |
|
820 |
Appl [Constant "_abs", Variable "x", Variable "t"], |
|
821 |
Appl [Constant "fun", Variable "'a", Variable "'b"]] |
|
822 |
\end{ttbox} |
|
823 |
is shown as {\tt ("_constrain" ("_abs" x t) ("fun" 'a 'b))}. |
|
824 |
Both {\tt ()} and {\tt (f)} are illegal because they have too few |
|
825 |
subtrees. |
|
826 |
||
827 |
The resemblance of Lisp's S-expressions is intentional, but there are two |
|
828 |
kinds of atomic symbols: $\Constant x$ and $\Variable x$. Do not take the |
|
829 |
names ``{\tt Constant}'' and ``{\tt Variable}'' too literally; in the later |
|
830 |
translation to terms, $\Variable x$ may become a constant, free or bound |
|
831 |
variable, even a type constructor or class name; the actual outcome depends |
|
832 |
on the context. |
|
833 |
||
834 |
Similarly, you can think of ${\tt (} f~x@1~\ldots~x@n{\tt )}$ as the |
|
835 |
application of~$f$ to the arguments $x@1, \ldots, x@n$. But the kind of |
|
836 |
application is determined later by context; it could be a type constructor |
|
837 |
applied to types. |
|
838 |
||
839 |
Forms like {\tt (("_abs" x $t$) $u$)} are legal, but \AST{}s are |
|
840 |
first-order: the {\tt "_abs"} does not bind the {\tt x} in any way. Later |
|
841 |
at the term level, {\tt ("_abs" x $t$)} will become an {\tt Abs} node and |
|
842 |
occurrences of {\tt x} in $t$ will be replaced by bound variables (the term |
|
843 |
constructor \ttindex{Bound}). |
|
844 |
||
845 |
||
846 |
\subsection{Transforming parse trees to \AST{}s} |
|
847 |
The parse tree is the raw output of the parser. Translation functions, |
|
848 |
called {\bf parse AST translations}\indexbold{translations!parse AST}, |
|
849 |
transform the parse tree into an abstract syntax tree. |
|
850 |
||
851 |
The parse tree is constructed by nesting the right-hand sides of the |
|
852 |
productions used to recognize the input. Such parse trees are simply lists |
|
853 |
of tokens and constituent parse trees, the latter representing the |
|
854 |
nonterminals of the productions. Let us refer to the actual productions in |
|
855 |
the form displayed by {\tt Syntax.print_syntax}. |
|
856 |
||
857 |
Ignoring parse \AST{} translations, parse trees are transformed to \AST{}s |
|
858 |
by stripping out delimiters and copy productions. More precisely, the |
|
859 |
mapping $ast_of_pt$\index{ast_of_pt@$ast_of_pt$} is derived from the |
|
860 |
productions as follows: |
|
861 |
\begin{itemize} |
|
862 |
\item Name tokens: $ast_of_pt(t) = \Variable s$, where $t$ is an $id$, |
|
863 |
$var$, $tid$ or $tvar$ token, and $s$ its associated string. |
|
864 |
||
865 |
\item Copy productions: $ast_of_pt(\ldots P \ldots) = ast_of_pt(P)$. |
|
866 |
Here $\ldots$ stands for strings of delimiters, which are |
|
867 |
discarded. $P$ stands for the single constituent that is not a |
|
868 |
delimiter; it is either a nonterminal symbol or a name token. |
|
869 |
||
870 |
\item $0$-ary productions: $ast_of_pt(\ldots \mtt{=>} c) = \Constant c$. |
|
871 |
Here there are no constituents other than delimiters, which are |
|
872 |
discarded. |
|
873 |
||
874 |
\item $n$-ary productions, where $n \ge 1$: delimiters are discarded and |
|
875 |
the remaining constituents $P@1$, \ldots, $P@n$ are built into an |
|
876 |
application whose head constant is~$c$: |
|
877 |
\begin{eqnarray*} |
|
878 |
\lefteqn{ast_of_pt(\ldots P@1 \ldots P@n \ldots \mtt{=>} c)} \\ |
|
879 |
&&\qquad{}= \Appl{\Constant c, ast_of_pt(P@1), \ldots, ast_of_pt(P@n)} |
|
880 |
\end{eqnarray*} |
|
881 |
\end{itemize} |
|
882 |
Figure~\ref{fig:parse_ast} presents some simple examples, where {\tt ==}, |
|
883 |
{\tt _appl}, {\tt _args}, and so forth name productions of the Pure syntax. |
|
884 |
These examples illustrate the need for further translations to make \AST{}s |
|
885 |
closer to the typed $\lambda$-calculus. The Pure syntax provides |
|
886 |
predefined parse \AST{} translations\index{translations!parse AST} for |
|
887 |
ordinary applications, type applications, nested abstractions, meta |
|
888 |
implications and function types. Figure~\ref{fig:parse_ast_tr} shows their |
|
889 |
effect on some representative input strings. |
|
890 |
||
891 |
||
892 |
\begin{figure} |
|
893 |
\begin{center} |
|
894 |
\tt\begin{tabular}{ll} |
|
895 |
\rm input string & \rm \AST \\\hline |
|
896 |
"f" & f \\ |
|
897 |
"'a" & 'a \\ |
|
898 |
"t == u" & ("==" t u) \\ |
|
899 |
"f(x)" & ("_appl" f x) \\ |
|
900 |
"f(x, y)" & ("_appl" f ("_args" x y)) \\ |
|
901 |
"f(x, y, z)" & ("_appl" f ("_args" x ("_args" y z))) \\ |
|
902 |
"\%x y.\ t" & ("_lambda" ("_idts" x y) t) \\ |
|
903 |
\end{tabular} |
|
904 |
\end{center} |
|
905 |
\caption{Parsing examples using the Pure syntax}\label{fig:parse_ast} |
|
906 |
\end{figure} |
|
907 |
||
908 |
\begin{figure} |
|
909 |
\begin{center} |
|
910 |
\tt\begin{tabular}{ll} |
|
911 |
\rm input string & \rm \AST{} \\\hline |
|
912 |
"f(x, y, z)" & (f x y z) \\ |
|
913 |
"'a ty" & (ty 'a) \\ |
|
914 |
"('a, 'b) ty" & (ty 'a 'b) \\ |
|
915 |
"\%x y z.\ t" & ("_abs" x ("_abs" y ("_abs" z t))) \\ |
|
916 |
"\%x ::\ 'a.\ t" & ("_abs" ("_constrain" x 'a) t) \\ |
|
917 |
"[| P; Q; R |] => S" & ("==>" P ("==>" Q ("==>" R S))) \\ |
|
918 |
"['a, 'b, 'c] => 'd" & ("fun" 'a ("fun" 'b ("fun" 'c 'd))) |
|
919 |
\end{tabular} |
|
920 |
\end{center} |
|
921 |
\caption{Built-in parse \AST{} translations}\label{fig:parse_ast_tr} |
|
922 |
\end{figure} |
|
923 |
||
924 |
The names of constant heads in the \AST{} control the translation process. |
|
925 |
The list of constants invoking parse \AST{} translations appears in the |
|
926 |
output of {\tt Syntax.print_syntax} under {\tt parse_ast_translation}. |
|
927 |
||
928 |
||
929 |
\subsection{Transforming \AST{}s to terms} |
|
930 |
The \AST{}, after application of macros (see \S\ref{sec:macros}), is |
|
931 |
transformed into a term. This term is probably ill-typed since type |
|
932 |
inference has not occurred yet. The term may contain type constraints |
|
933 |
consisting of applications with head {\tt "_constrain"}; the second |
|
934 |
argument is a type encoded as a term. Type inference later introduces |
|
935 |
correct types or rejects the input. |
|
936 |
||
937 |
Another set of translation functions, namely parse |
|
938 |
translations,\index{translations!parse}, may affect this process. If we |
|
939 |
ignore parse translations for the time being, then \AST{}s are transformed |
|
940 |
to terms by mapping \AST{} constants to constants, \AST{} variables to |
|
941 |
schematic or free variables and \AST{} applications to applications. |
|
942 |
||
943 |
More precisely, the mapping $term_of_ast$\index{term_of_ast@$term_of_ast$} |
|
944 |
is defined by |
|
945 |
\begin{itemize} |
|
946 |
\item Constants: $term_of_ast(\Constant x) = \ttfct{Const} (x, |
|
947 |
\mtt{dummyT})$. |
|
948 |
||
949 |
\item Schematic variables: $term_of_ast(\Variable \mtt{"?}xi\mtt") = |
|
950 |
\ttfct{Var} ((x, i), \mtt{dummyT})$, where $x$ is the base name and $i$ |
|
951 |
the index extracted from $xi$. |
|
952 |
||
953 |
\item Free variables: $term_of_ast(\Variable x) = \ttfct{Free} (x, |
|
954 |
\mtt{dummyT})$. |
|
955 |
||
956 |
\item Function applications with $n$ arguments: |
|
957 |
\begin{eqnarray*} |
|
958 |
\lefteqn{term_of_ast(\Appl{f, x@1, \ldots, x@n})} \\ |
|
959 |
&&\qquad{}= term_of_ast(f) \ttapp |
|
960 |
term_of_ast(x@1) \ttapp \ldots \ttapp term_of_ast(x@n) |
|
961 |
\end{eqnarray*} |
|
962 |
\end{itemize} |
|
963 |
Here \ttindex{Const}, \ttindex{Var}, \ttindex{Free} and |
|
964 |
\verb|$|\index{$@{\tt\$}} are constructors of the datatype {\tt term}, |
|
965 |
while \ttindex{dummyT} stands for some dummy type that is ignored during |
|
966 |
type inference. |
|
967 |
||
968 |
So far the outcome is still a first-order term. Abstractions and bound |
|
969 |
variables (constructors \ttindex{Abs} and \ttindex{Bound}) are introduced |
|
970 |
by parse translations. Such translations are attached to {\tt "_abs"}, |
|
971 |
{\tt "!!"} and user-defined binders. |
|
972 |
||
973 |
||
974 |
\subsection{Printing of terms} |
|
975 |
The output phase is essentially the inverse of the input phase. Terms are |
|
976 |
translated via abstract syntax trees into strings. Finally the strings are |
|
977 |
pretty printed. |
|
978 |
||
979 |
Print translations (\S\ref{sec:tr_funs}) may affect the transformation of |
|
980 |
terms into \AST{}s. Ignoring those, the transformation maps |
|
981 |
term constants, variables and applications to the corresponding constructs |
|
982 |
on \AST{}s. Abstractions are mapped to applications of the special |
|
983 |
constant {\tt _abs}. |
|
984 |
||
985 |
More precisely, the mapping $ast_of_term$\index{ast_of_term@$ast_of_term$} |
|
986 |
is defined as follows: |
|
987 |
\begin{itemize} |
|
988 |
\item $ast_of_term(\ttfct{Const} (x, \tau)) = \Constant x$. |
|
989 |
||
990 |
\item $ast_of_term(\ttfct{Free} (x, \tau)) = constrain (\Variable x, |
|
991 |
\tau)$. |
|
992 |
||
993 |
\item $ast_of_term(\ttfct{Var} ((x, i), \tau)) = constrain (\Variable |
|
994 |
\mtt{"?}xi\mtt", \tau)$, where $\mtt?xi$ is the string representation of |
|
995 |
the {\tt indexname} $(x, i)$. |
|
996 |
||
997 |
\item For the abstraction $\lambda x::\tau.t$, let $x'$ be a variant |
|
998 |
of~$x$ renamed to differ from all names occurring in~$t$, and let $t'$ |
|
999 |
be obtained from~$t$ by replacing all bound occurrences of~$x$ by |
|
1000 |
the free variable $x'$. This replaces corresponding occurrences of the |
|
1001 |
constructor \ttindex{Bound} by the term $\ttfct{Free} (x', |
|
1002 |
\mtt{dummyT})$: |
|
1003 |
\begin{eqnarray*} |
|
1004 |
\lefteqn{ast_of_term(\ttfct{Abs} (x, \tau, t))} \\ |
|
1005 |
&&\qquad{}= \ttfct{Appl} |
|
1006 |
\mathopen{\mtt[} |
|
1007 |
\Constant \mtt{"_abs"}, constrain(\Variable x', \tau), \\ |
|
1008 |
&&\qquad\qquad\qquad ast_of_term(t') \mathclose{\mtt]}. |
|
1009 |
\end{eqnarray*} |
|
1010 |
||
1011 |
\item $ast_of_term(\ttfct{Bound} i) = \Variable \mtt{"B.}i\mtt"$. |
|
1012 |
The occurrence of constructor \ttindex{Bound} should never happen |
|
1013 |
when printing well-typed terms; it indicates a de Bruijn index with no |
|
1014 |
matching abstraction. |
|
1015 |
||
1016 |
\item Where $f$ is not an application, |
|
1017 |
\begin{eqnarray*} |
|
1018 |
\lefteqn{ast_of_term(f \ttapp x@1 \ttapp \ldots \ttapp x@n)} \\ |
|
1019 |
&&\qquad{}= \ttfct{Appl} |
|
1020 |
\mathopen{\mtt[} ast_of_term(f), |
|
1021 |
ast_of_term(x@1), \ldots,ast_of_term(x@n) |
|
1022 |
\mathclose{\mtt]} |
|
1023 |
\end{eqnarray*} |
|
1024 |
\end{itemize} |
|
1025 |
||
1026 |
Type constraints are inserted to allow the printing of types, which is |
|
1027 |
governed by the boolean variable \ttindex{show_types}. Constraints are |
|
1028 |
treated as follows: |
|
1029 |
\begin{itemize} |
|
1030 |
\item $constrain(x, \tau) = x$, if $\tau = \mtt{dummyT}$ \index{*dummyT} or |
|
1031 |
\ttindex{show_types} not set to {\tt true}. |
|
1032 |
||
1033 |
\item $constrain(x, \tau) = \Appl{\Constant \mtt{"_constrain"}, x, ty}$, |
|
1034 |
where $ty$ is the \AST{} encoding of $\tau$. That is, type constructors as |
|
1035 |
{\tt Constant}s, type identifiers as {\tt Variable}s and type applications |
|
1036 |
as {\tt Appl}s with the head type constructor as first element. |
|
1037 |
Additionally, if \ttindex{show_sorts} is set to {\tt true}, some type |
|
1038 |
variables are decorated with an \AST{} encoding of their sort. |
|
1039 |
\end{itemize} |
|
1040 |
||
1041 |
The \AST{}, after application of macros (see \S\ref{sec:macros}), is |
|
1042 |
transformed into the final output string. The built-in {\bf print AST |
|
1043 |
translations}\indexbold{translations!print AST} effectively reverse the |
|
1044 |
parse \AST{} translations of Fig.\ts\ref{fig:parse_ast_tr}. |
|
1045 |
||
1046 |
For the actual printing process, the names attached to productions |
|
1047 |
of the form $\ldots A^{(p@1)}@1 \ldots A^{(p@n)}@n \ldots \mtt{=>} c$ play |
|
1048 |
a vital role. Each \AST{} with constant head $c$, namely $\mtt"c\mtt"$ or |
|
1049 |
$(\mtt"c\mtt"~ x@1 \ldots x@n)$, is printed according to the production |
|
1050 |
for~$c$. Each argument~$x@i$ is converted to a string, and put in |
|
1051 |
parentheses if its priority~$(p@i)$ requires this. The resulting strings |
|
1052 |
and their syntactic sugar (denoted by ``\dots'' above) are joined to make a |
|
1053 |
single string. |
|
1054 |
||
1055 |
If an application $(\mtt"c\mtt"~ x@1 \ldots x@m)$ has more arguments than the |
|
1056 |
corresponding production, it is first split into $((\mtt"c\mtt"~ x@1 \ldots |
|
1057 |
x@n) ~ x@{n+1} \ldots x@m)$. Applications with too few arguments or with |
|
1058 |
non-constant head or without a corresponding production are printed as |
|
1059 |
$f(x@1, \ldots, x@l)$ or $(\alpha@1, \ldots, \alpha@l) ty$. An occurrence of |
|
1060 |
$\Variable x$ is simply printed as~$x$. |
|
1061 |
||
1062 |
Blanks are {\em not\/} inserted automatically. If blanks are required to |
|
1063 |
separate tokens, specify them in the mixfix declaration, possibly preceeded |
|
1064 |
by a slash~({\tt/}) to allow a line break. |
|
1065 |
\index{trees!abstract syntax|)} |
|
1066 |
||
1067 |
||
1068 |
||
1069 |
\section{*Macros: Syntactic rewriting} \label{sec:macros} |
|
1070 |
\index{macros|(}\index{rewriting!syntactic|(} |
|
1071 |
||
1072 |
Mixfix declarations alone can handle situations where there is a direct |
|
1073 |
connection between the concrete syntax and the underlying term. Sometimes |
|
1074 |
we require a more elaborate concrete syntax, such as quantifiers and list |
|
1075 |
notation. Isabelle's {\bf macros} and {\bf translation functions} can |
|
1076 |
perform translations such as |
|
1077 |
\begin{center}\tt |
|
1078 |
\begin{tabular}{r@{$\quad\protect\rightleftharpoons\quad$}l} |
|
1079 |
ALL x:A.P & Ball(A, \%x.P) \\ \relax |
|
1080 |
[x, y, z] & Cons(x, Cons(y, Cons(z, Nil))) |
|
1081 |
\end{tabular} |
|
1082 |
\end{center} |
|
1083 |
Translation functions (see \S\ref{sec:tr_funs}) must be coded in ML; they |
|
1084 |
are the most powerful translation mechanism but are difficult to read or |
|
1085 |
write. Macros are specified by first-order rewriting systems that operate |
|
1086 |
on abstract syntax trees. They are usually easy to read and write, and can |
|
1087 |
express all but the most obscure translations. |
|
1088 |
||
1089 |
Figure~\ref{fig:set_trans} defines a fragment of first-order logic and set |
|
1090 |
theory.\footnote{This and the following theories are complete working |
|
1091 |
examples, though they specify only syntax, no axioms. The file {\tt |
|
1092 |
ZF/zf.thy} presents the full set theory definition, including many |
|
1093 |
macro rules.} Theory {\tt SET} defines constants for set comprehension |
|
1094 |
({\tt Collect}), replacement ({\tt Replace}) and bounded universal |
|
1095 |
quantification ({\tt Ball}). Each of these binds some variables. Without |
|
1096 |
additional syntax we should have to express $\forall x \in A. P$ as {\tt |
|
1097 |
Ball(A,\%x.P)}, and similarly for the others. |
|
1098 |
||
1099 |
\begin{figure} |
|
1100 |
\begin{ttbox} |
|
1101 |
SET = Pure + |
|
1102 |
types |
|
1103 |
i, o |
|
1104 |
arities |
|
1105 |
i, o :: logic |
|
1106 |
consts |
|
1107 |
Trueprop :: "o => prop" ("_" 5) |
|
1108 |
Collect :: "[i, i => o] => i" |
|
1109 |
"{\at}Collect" :: "[idt, i, o] => i" ("(1{\ttlbrace}_:_./ _{\ttrbrace})") |
|
1110 |
Replace :: "[i, [i, i] => o] => i" |
|
1111 |
"{\at}Replace" :: "[idt, idt, i, o] => i" ("(1{\ttlbrace}_./ _:_, _{\ttrbrace})") |
|
1112 |
Ball :: "[i, i => o] => o" |
|
1113 |
"{\at}Ball" :: "[idt, i, o] => o" ("(3ALL _:_./ _)" 10) |
|
1114 |
translations |
|
1115 |
"{\ttlbrace}x:A. P{\ttrbrace}" == "Collect(A, \%x. P)" |
|
1116 |
"{\ttlbrace}y. x:A, Q{\ttrbrace}" == "Replace(A, \%x y. Q)" |
|
1117 |
"ALL x:A. P" == "Ball(A, \%x. P)" |
|
1118 |
end |
|
1119 |
\end{ttbox} |
|
1120 |
\caption{Macro example: set theory}\label{fig:set_trans} |
|
1121 |
\end{figure} |
|
1122 |
||
1123 |
The theory specifies a variable-binding syntax through additional |
|
1124 |
productions that have mixfix declarations. Each non-copy production must |
|
1125 |
specify some constant, which is used for building \AST{}s. The additional |
|
1126 |
constants are decorated with {\tt\at} to stress their purely syntactic |
|
1127 |
purpose; they should never occur within the final well-typed terms. |
|
1128 |
Furthermore, they cannot be written in formulae because they are not legal |
|
1129 |
identifiers. |
|
1130 |
||
1131 |
The translations cause the replacement of external forms by internal forms |
|
1132 |
after parsing, and vice versa before printing of terms. As a specification |
|
1133 |
of the set theory notation, they should be largely self-explanatory. The |
|
1134 |
syntactic constants, {\tt\at Collect}, {\tt\at Replace} and {\tt\at Ball}, |
|
1135 |
appear implicitly in the macro rules via their mixfix forms. |
|
1136 |
||
1137 |
Macros can define variable-binding syntax because they operate on \AST{}s, |
|
1138 |
which have no inbuilt notion of bound variable. The macro variables {\tt |
|
1139 |
x} and~{\tt y} have type~{\tt idt} and therefore range over identifiers, |
|
1140 |
in this case bound variables. The macro variables {\tt P} and~{\tt Q} |
|
1141 |
range over formulae containing bound variable occurrences. |
|
1142 |
||
1143 |
Other applications of the macro system can be less straightforward, and |
|
1144 |
there are peculiarities. The rest of this section will describe in detail |
|
1145 |
how Isabelle macros are preprocessed and applied. |
|
1146 |
||
1147 |
||
1148 |
\subsection{Specifying macros} |
|
1149 |
Macros are basically rewrite rules on \AST{}s. But unlike other macro |
|
1150 |
systems found in programming languages, Isabelle's macros work in both |
|
1151 |
directions. Therefore a syntax contains two lists of rewrites: one for |
|
1152 |
parsing and one for printing. |
|
1153 |
||
1154 |
The {\tt translations} section\index{translations section@{\tt translations} |
|
1155 |
section} specifies macros. The syntax for a macro is |
|
1156 |
\[ (root)\; string \quad |
|
1157 |
\left\{\begin{array}[c]{c} \mtt{=>} \\ \mtt{<=} \\ \mtt{==} \end{array} |
|
1158 |
\right\} \quad |
|
1159 |
(root)\; string |
|
1160 |
\] |
|
1161 |
% |
|
1162 |
This specifies a parse rule ({\tt =>}), a print rule ({\tt <=}), or both |
|
1163 |
({\tt ==}). The two strings specify the left and right-hand sides of the |
|
1164 |
macro rule. The $(root)$ specification is optional; it specifies the |
|
1165 |
nonterminal for parsing the $string$ and if omitted defaults to {\tt |
|
1166 |
logic}. \AST{} rewrite rules $(l, r)$ must obey certain conditions: |
|
1167 |
\begin{itemize} |
|
1168 |
\item Rules must be left linear: $l$ must not contain repeated variables. |
|
1169 |
||
1170 |
\item Rules must have constant heads, namely $l = \mtt"c\mtt"$ or $l = |
|
1171 |
(\mtt"c\mtt" ~ x@1 \ldots x@n)$. |
|
1172 |
||
1173 |
\item Every variable in~$r$ must also occur in~$l$. |
|
1174 |
\end{itemize} |
|
1175 |
||
1176 |
Macro rules may refer to any syntax from the parent theories. They may |
|
1177 |
also refer to anything defined before the the {\tt .thy} file's {\tt |
|
1178 |
translations} section --- including any mixfix declarations. |
|
1179 |
||
1180 |
Upon declaration, both sides of the macro rule undergo parsing and parse |
|
1181 |
\AST{} translations (see \S\ref{sec:asts}), but do not themselves undergo |
|
1182 |
macro expansion. The lexer runs in a different mode that additionally |
|
1183 |
accepts identifiers of the form $\_~letter~quasiletter^*$ (like {\tt _idt}, |
|
1184 |
{\tt _K}). Thus, a constant whose name starts with an underscore can |
|
1185 |
appear in macro rules but not in ordinary terms. |
|
1186 |
||
1187 |
Some atoms of the macro rule's \AST{} are designated as constants for |
|
1188 |
matching. These are all names that have been declared as classes, types or |
|
1189 |
constants. |
|
1190 |
||
1191 |
The result of this preprocessing is two lists of macro rules, each stored |
|
1192 |
as a pair of \AST{}s. They can be viewed using {\tt Syntax.print_syntax} |
|
1193 |
(sections \ttindex{parse_rules} and \ttindex{print_rules}). For |
|
1194 |
theory~{\tt SET} of Fig.~\ref{fig:set_trans} these are |
|
1195 |
\begin{ttbox} |
|
1196 |
parse_rules: |
|
1197 |
("{\at}Collect" x A P) -> ("Collect" A ("_abs" x P)) |
|
1198 |
("{\at}Replace" y x A Q) -> ("Replace" A ("_abs" x ("_abs" y Q))) |
|
1199 |
("{\at}Ball" x A P) -> ("Ball" A ("_abs" x P)) |
|
1200 |
print_rules: |
|
1201 |
("Collect" A ("_abs" x P)) -> ("{\at}Collect" x A P) |
|
1202 |
("Replace" A ("_abs" x ("_abs" y Q))) -> ("{\at}Replace" y x A Q) |
|
1203 |
("Ball" A ("_abs" x P)) -> ("{\at}Ball" x A P) |
|
1204 |
\end{ttbox} |
|
1205 |
||
1206 |
\begin{warn} |
|
1207 |
Avoid choosing variable names that have previously been used as |
|
1208 |
constants, types or type classes; the {\tt consts} section in the output |
|
1209 |
of {\tt Syntax.print_syntax} lists all such names. If a macro rule works |
|
1210 |
incorrectly, inspect its internal form as shown above, recalling that |
|
1211 |
constants appear as quoted strings and variables without quotes. |
|
1212 |
\end{warn} |
|
1213 |
||
1214 |
\begin{warn} |
|
1215 |
If \ttindex{eta_contract} is set to {\tt true}, terms will be |
|
1216 |
$\eta$-contracted {\em before\/} the \AST{} rewriter sees them. Thus some |
|
1217 |
abstraction nodes needed for print rules to match may vanish. For example, |
|
1218 |
\verb|Ball(A, %x. P(x))| contracts {\tt Ball(A, P)}; the print rule does |
|
1219 |
not apply and the output will be {\tt Ball(A, P)}. This problem would not |
|
1220 |
occur if \ML{} translation functions were used instead of macros (as is |
|
1221 |
done for binder declarations). |
|
1222 |
\end{warn} |
|
1223 |
||
1224 |
||
1225 |
\begin{warn} |
|
1226 |
Another trap concerns type constraints. If \ttindex{show_types} is set to |
|
1227 |
{\tt true}, bound variables will be decorated by their meta types at the |
|
1228 |
binding place (but not at occurrences in the body). Matching with |
|
1229 |
\verb|Collect(A, %x. P)| binds {\tt x} to something like {\tt ("_constrain" y |
|
1230 |
"i")} rather than only {\tt y}. \AST{} rewriting will cause the constraint to |
|
1231 |
appear in the external form, say \verb|{y::i:A::i. P::o}|. |
|
1232 |
||
1233 |
To allow such constraints to be re-read, your syntax should specify bound |
|
1234 |
variables using the nonterminal~\ttindex{idt}. This is the case in our |
|
1235 |
example. Choosing {\tt id} instead of {\tt idt} is a common error, |
|
1236 |
especially since it appears in former versions of most of Isabelle's |
|
1237 |
object-logics. |
|
1238 |
\end{warn} |
|
1239 |
||
1240 |
||
1241 |
||
1242 |
\subsection{Applying rules} |
|
1243 |
As a term is being parsed or printed, an \AST{} is generated as an |
|
1244 |
intermediate form (recall Fig.\ts\ref{fig:parse_print}). The \AST{} is |
|
1245 |
normalized by applying macro rules in the manner of a traditional term |
|
1246 |
rewriting system. We first examine how a single rule is applied. |
|
1247 |
||
1248 |
Let $t$ be the abstract syntax tree to be normalized and $(l, r)$ some |
|
1249 |
translation rule. A subtree~$u$ of $t$ is a {\bf redex} if it is an |
|
1250 |
instance of~$l$; in this case $l$ is said to {\bf match}~$u$. A redex |
|
1251 |
matched by $l$ may be replaced by the corresponding instance of~$r$, thus |
|
1252 |
{\bf rewriting} the \AST~$t$. Matching requires some notion of {\bf |
|
1253 |
place-holders} that may occur in rule patterns but not in ordinary |
|
1254 |
\AST{}s; {\tt Variable} atoms serve this purpose. |
|
1255 |
||
1256 |
The matching of the object~$u$ by the pattern~$l$ is performed as follows: |
|
1257 |
\begin{itemize} |
|
1258 |
\item Every constant matches itself. |
|
1259 |
||
1260 |
\item $\Variable x$ in the object matches $\Constant x$ in the pattern. |
|
1261 |
This point is discussed further below. |
|
1262 |
||
1263 |
\item Every \AST{} in the object matches $\Variable x$ in the pattern, |
|
1264 |
binding~$x$ to~$u$. |
|
1265 |
||
1266 |
\item One application matches another if they have the same number of |
|
1267 |
subtrees and corresponding subtrees match. |
|
1268 |
||
1269 |
\item In every other case, matching fails. In particular, {\tt |
|
1270 |
Constant}~$x$ can only match itself. |
|
1271 |
\end{itemize} |
|
1272 |
A successful match yields a substitution that is applied to~$r$, generating |
|
1273 |
the instance that replaces~$u$. |
|
1274 |
||
1275 |
The second case above may look odd. This is where {\tt Variable}s of |
|
1276 |
non-rule \AST{}s behave like {\tt Constant}s. Recall that \AST{}s are not |
|
1277 |
far removed from parse trees; at this level it is not yet known which |
|
1278 |
identifiers will become constants, bounds, frees, types or classes. As |
|
1279 |
\S\ref{sec:asts} describes, former parse tree heads appear in \AST{}s as |
|
1280 |
{\tt Constant}s, while $id$s, $var$s, $tid$s and $tvar$s become {\tt |
|
1281 |
Variable}s. On the other hand, when \AST{}s generated from terms for |
|
1282 |
printing, all constants and type constructors become {\tt Constant}s; see |
|
1283 |
\S\ref{sec:asts}. Thus \AST{}s may contain a messy mixture of {\tt |
|
1284 |
Variable}s and {\tt Constant}s. This is insignificant at macro level |
|
1285 |
because matching treats them alike. |
|
1286 |
||
1287 |
Because of this behaviour, different kinds of atoms with the same name are |
|
1288 |
indistinguishable, which may make some rules prone to misbehaviour. Example: |
|
1289 |
\begin{ttbox} |
|
1290 |
types |
|
1291 |
Nil |
|
1292 |
consts |
|
1293 |
Nil :: "'a list" |
|
1294 |
"[]" :: "'a list" ("[]") |
|
1295 |
translations |
|
1296 |
"[]" == "Nil" |
|
1297 |
\end{ttbox} |
|
1298 |
The term {\tt Nil} will be printed as {\tt []}, just as expected. What |
|
1299 |
happens with \verb|%Nil.t| or {\tt x::Nil} is left as an exercise. |
|
1300 |
||
1301 |
Normalizing an \AST{} involves repeatedly applying macro rules until none |
|
1302 |
is applicable. Macro rules are chosen in the order that they appear in the |
|
1303 |
{\tt translations} section. You can watch the normalization of \AST{}s |
|
1304 |
during parsing and printing by setting \ttindex{Syntax.trace_norm_ast} to |
|
1305 |
{\tt true}.\index{tracing!of macros} Alternatively, use |
|
1306 |
\ttindex{Syntax.test_read}. The information displayed when tracing |
|
1307 |
includes the \AST{} before normalization ({\tt pre}), redexes with results |
|
1308 |
({\tt rewrote}), the normal form finally reached ({\tt post}) and some |
|
1309 |
statistics ({\tt normalize}). If tracing is off, |
|
1310 |
\ttindex{Syntax.stat_norm_ast} can be set to {\tt true} in order to enable |
|
1311 |
printing of the normal form and statistics only. |
|
1312 |
||
1313 |
||
1314 |
\subsection{Example: the syntax of finite sets} |
|
1315 |
This example demonstrates the use of recursive macros to implement a |
|
1316 |
convenient notation for finite sets. |
|
1317 |
\begin{ttbox} |
|
1318 |
FINSET = SET + |
|
1319 |
types |
|
1320 |
is |
|
1321 |
consts |
|
1322 |
"" :: "i => is" ("_") |
|
1323 |
"{\at}Enum" :: "[i, is] => is" ("_,/ _") |
|
1324 |
empty :: "i" ("{\ttlbrace}{\ttrbrace}") |
|
1325 |
insert :: "[i, i] => i" |
|
1326 |
"{\at}Finset" :: "is => i" ("{\ttlbrace}(_){\ttrbrace}") |
|
1327 |
translations |
|
1328 |
"{\ttlbrace}x, xs{\ttrbrace}" == "insert(x, {\ttlbrace}xs{\ttrbrace})" |
|
1329 |
"{\ttlbrace}x{\ttrbrace}" == "insert(x, {\ttlbrace}{\ttrbrace})" |
|
1330 |
end |
|
1331 |
\end{ttbox} |
|
1332 |
Finite sets are internally built up by {\tt empty} and {\tt insert}. The |
|
1333 |
declarations above specify \verb|{x, y, z}| as the external representation |
|
1334 |
of |
|
1335 |
\begin{ttbox} |
|
1336 |
insert(x, insert(y, insert(z, empty))) |
|
1337 |
\end{ttbox} |
|
1338 |
||
1339 |
The nonterminal symbol~{\tt is} stands for one or more objects of type~{\tt |
|
1340 |
i} separated by commas. The mixfix declaration \hbox{\verb|"_,/ _"|} |
|
1341 |
allows a line break after the comma for pretty printing; if no line break |
|
1342 |
is required then a space is printed instead. |
|
1343 |
||
1344 |
The nonterminal is declared as the type~{\tt is}, but with no {\tt arities} |
|
1345 |
declaration. Hence {\tt is} is not a logical type and no default |
|
1346 |
productions are added. If we had needed enumerations of the nonterminal |
|
1347 |
{\tt logic}, which would include all the logical types, we could have used |
|
1348 |
the predefined nonterminal symbol \ttindex{args} and skipped this part |
|
1349 |
altogether. The nonterminal~{\tt is} can later be reused for other |
|
1350 |
enumerations of type~{\tt i} like lists or tuples. |
|
1351 |
||
1352 |
Next follows {\tt empty}, which is already equipped with its syntax |
|
1353 |
\verb|{}|, and {\tt insert} without concrete syntax. The syntactic |
|
1354 |
constant {\tt\at Finset} provides concrete syntax for enumerations of~{\tt |
|
1355 |
i} enclosed in curly braces. Remember that a pair of parentheses, as in |
|
1356 |
\verb|"{(_)}"|, specifies a block of indentation for pretty printing. |
|
1357 |
||
1358 |
The translations may look strange at first. Macro rules are best |
|
1359 |
understood in their internal forms: |
|
1360 |
\begin{ttbox} |
|
1361 |
parse_rules: |
|
1362 |
("{\at}Finset" ("{\at}Enum" x xs)) -> ("insert" x ("{\at}Finset" xs)) |
|
1363 |
("{\at}Finset" x) -> ("insert" x "empty") |
|
1364 |
print_rules: |
|
1365 |
("insert" x ("{\at}Finset" xs)) -> ("{\at}Finset" ("{\at}Enum" x xs)) |
|
1366 |
("insert" x "empty") -> ("{\at}Finset" x) |
|
1367 |
\end{ttbox} |
|
1368 |
This shows that \verb|{x, xs}| indeed matches any set enumeration of at least |
|
1369 |
two elements, binding the first to {\tt x} and the rest to {\tt xs}. |
|
1370 |
Likewise, \verb|{xs}| and \verb|{x}| represent any set enumeration. |
|
1371 |
The parse rules only work in the order given. |
|
1372 |
||
1373 |
\begin{warn} |
|
1374 |
The \AST{} rewriter cannot discern constants from variables and looks |
|
1375 |
only for names of atoms. Thus the names of {\tt Constant}s occurring in |
|
1376 |
the (internal) left-hand side of translation rules should be regarded as |
|
1377 |
reserved keywords. Choose non-identifiers like {\tt\at Finset} or |
|
1378 |
sufficiently long and strange names. If a bound variable's name gets |
|
1379 |
rewritten, the result will be incorrect; for example, the term |
|
1380 |
\begin{ttbox} |
|
1381 |
\%empty insert. insert(x, empty) |
|
1382 |
\end{ttbox} |
|
1383 |
gets printed as \verb|%empty insert. {x}|. |
|
1384 |
\end{warn} |
|
1385 |
||
1386 |
||
1387 |
\subsection{Example: a parse macro for dependent types}\label{prod_trans} |
|
1388 |
As stated earlier, a macro rule may not introduce new {\tt Variable}s on |
|
1389 |
the right-hand side. Something like \verb|"K(B)" => "%x. B"| is illegal; |
|
1390 |
it allowed, it could cause variable capture. In such cases you usually |
|
1391 |
must fall back on translation functions. But a trick can make things |
|
1392 |
readable in some cases: {\em calling translation functions by parse |
|
1393 |
macros}: |
|
1394 |
\begin{ttbox} |
|
1395 |
PROD = FINSET + |
|
1396 |
consts |
|
1397 |
Pi :: "[i, i => i] => i" |
|
1398 |
"{\at}PROD" :: "[idt, i, i] => i" ("(3PROD _:_./ _)" 10) |
|
1399 |
"{\at}->" :: "[i, i] => i" ("(_ ->/ _)" [51, 50] 50) |
|
1400 |
\ttbreak |
|
1401 |
translations |
|
1402 |
"PROD x:A. B" => "Pi(A, \%x. B)" |
|
1403 |
"A -> B" => "Pi(A, _K(B))" |
|
1404 |
end |
|
1405 |
ML |
|
1406 |
val print_translation = [("Pi", dependent_tr' ("{\at}PROD", "{\at}->"))]; |
|
1407 |
\end{ttbox} |
|
1408 |
||
1409 |
Here {\tt Pi} is an internal constant for constructing general products. |
|
1410 |
Two external forms exist: the general case {\tt PROD x:A.B} and the |
|
1411 |
function space {\tt A -> B}, which abbreviates \verb|Pi(A, %x.B)| when {\tt B} |
|
1412 |
does not depend on~{\tt x}. |
|
1413 |
||
1414 |
The second parse macro introduces {\tt _K(B)}, which later becomes \verb|%x.B| |
|
1415 |
due to a parse translation associated with \ttindex{_K}. The order of the |
|
1416 |
parse rules is critical. Unfortunately there is no such trick for |
|
1417 |
printing, so we have to add a {\tt ML} section for the print translation |
|
1418 |
\ttindex{dependent_tr'}. |
|
1419 |
||
1420 |
Recall that identifiers with a leading {\tt _} are allowed in translation |
|
1421 |
rules, but not in ordinary terms. Thus we can create \AST{}s containing |
|
1422 |
names that are not directly expressible. |
|
1423 |
||
1424 |
The parse translation for {\tt _K} is already installed in Pure, and {\tt |
|
1425 |
dependent_tr'} is exported by the syntax module for public use. See |
|
1426 |
\S\ref{sec:tr_funs} below for more of the arcane lore of translation functions. |
|
1427 |
\index{macros|)}\index{rewriting!syntactic|)} |
|
1428 |
||
1429 |
||
1430 |
||
1431 |
\section{*Translation functions} \label{sec:tr_funs} |
|
1432 |
\index{translations|(} |
|
1433 |
% |
|
1434 |
This section describes the translation function mechanism. By writing |
|
1435 |
\ML{} functions, you can do almost everything with terms or \AST{}s during |
|
1436 |
parsing and printing. The logic \LK\ is a good example of sophisticated |
|
1437 |
transformations between internal and external representations of |
|
1438 |
associative sequences; here, macros would be useless. |
|
1439 |
||
1440 |
A full understanding of translations requires some familiarity |
|
1441 |
with Isabelle's internals, especially the datatypes {\tt term}, {\tt typ}, |
|
1442 |
{\tt Syntax.ast} and the encodings of types and terms as such at the various |
|
1443 |
stages of the parsing or printing process. Most users should never need to |
|
1444 |
use translation functions. |
|
1445 |
||
1446 |
\subsection{Declaring translation functions} |
|
1447 |
There are four kinds of translation functions. Each such function is |
|
1448 |
associated with a name, which triggers calls to it. Such names can be |
|
1449 |
constants (logical or syntactic) or type constructors. |
|
1450 |
||
1451 |
{\tt Syntax.print_syntax} displays the sets of names associated with the |
|
1452 |
translation functions of a {\tt Syntax.syntax} under |
|
1453 |
\ttindex{parse_ast_translation}, \ttindex{parse_translation}, |
|
1454 |
\ttindex{print_translation} and \ttindex{print_ast_translation}. You can |
|
1455 |
add new ones via the {\tt ML} section\index{ML section@{\tt ML} section} of |
|
1456 |
a {\tt .thy} file. There may never be more than one function of the same |
|
1457 |
kind per name. Conceptually, the {\tt ML} section should appear between |
|
1458 |
{\tt consts} and {\tt translations}; newly installed translation functions |
|
1459 |
are already effective when macros and logical rules are parsed. |
|
1460 |
||
1461 |
The {\tt ML} section is copied verbatim into the \ML\ file generated from a |
|
1462 |
{\tt .thy} file. Definitions made here are accessible as components of an |
|
1463 |
\ML\ structure; to make some definitions private, use an \ML{} {\tt local} |
|
1464 |
declaration. The {\tt ML} section may install translation functions by |
|
1465 |
declaring any of the following identifiers: |
|
1466 |
\begin{ttbox} |
|
1467 |
val parse_ast_translation : (string * (ast list -> ast)) list |
|
1468 |
val print_ast_translation : (string * (ast list -> ast)) list |
|
1469 |
val parse_translation : (string * (term list -> term)) list |
|
1470 |
val print_translation : (string * (term list -> term)) list |
|
1471 |
\end{ttbox} |
|
1472 |
||
1473 |
\subsection{The translation strategy} |
|
1474 |
All four kinds of translation functions are treated similarly. They are |
|
1475 |
called during the transformations between parse trees, \AST{}s and terms |
|
1476 |
(recall Fig.\ts\ref{fig:parse_print}). Whenever a combination of the form |
|
1477 |
$(\mtt"c\mtt"~x@1 \ldots x@n)$ is encountered, and a translation function |
|
1478 |
$f$ of appropriate kind exists for $c$, the result is computed by the \ML{} |
|
1479 |
function call $f \mtt[ x@1, \ldots, x@n \mtt]$. |
|
1480 |
||
1481 |
For \AST{} translations, the arguments $x@1, \ldots, x@n$ are \AST{}s. A |
|
1482 |
combination has the form $\Constant c$ or $\Appl{\Constant c, x@1, \ldots, |
|
1483 |
x@n}$. For term translations, the arguments are terms and a combination |
|
1484 |
has the form $\ttfct{Const} (c, \tau)$ or $\ttfct{Const} (c, \tau) \ttapp |
|
1485 |
x@1 \ttapp \ldots \ttapp x@n$. Terms allow more sophisticated |
|
1486 |
transformations than \AST{}s do, typically involving abstractions and bound |
|
1487 |
variables. |
|
1488 |
||
1489 |
Regardless of whether they act on terms or \AST{}s, |
|
1490 |
parse translations differ from print translations fundamentally: |
|
1491 |
\begin{description} |
|
1492 |
\item[Parse translations] are applied bottom-up. The arguments are already |
|
1493 |
in translated form. The translations must not fail; exceptions trigger |
|
1494 |
an error message. |
|
1495 |
||
1496 |
\item[Print translations] are applied top-down. They are supplied with |
|
1497 |
arguments that are partly still in internal form. The result again |
|
1498 |
undergoes translation; therefore a print translation should not introduce |
|
1499 |
as head the very constant that invoked it. The function may raise |
|
1500 |
exception \ttindex{Match} to indicate failure; in this event it has no |
|
1501 |
effect. |
|
1502 |
\end{description} |
|
1503 |
||
1504 |
Only constant atoms --- constructor \ttindex{Constant} for \AST{}s and |
|
1505 |
\ttindex{Const} for terms --- can invoke translation functions. This |
|
1506 |
causes another difference between parsing and printing. |
|
1507 |
||
1508 |
Parsing starts with a string and the constants are not yet identified. |
|
1509 |
Only parse tree heads create {\tt Constant}s in the resulting \AST; recall |
|
1510 |
$ast_of_pt$ in \S\ref{sec:asts}. Macros and parse \AST{} translations may |
|
1511 |
introduce further {\tt Constant}s. When the final \AST{} is converted to a |
|
1512 |
term, all {\tt Constant}s become {\tt Const}s; recall $term_of_ast$ in |
|
1513 |
\S\ref{sec:asts}. |
|
1514 |
||
1515 |
Printing starts with a well-typed term and all the constants are known. So |
|
1516 |
all logical constants and type constructors may invoke print translations. |
|
1517 |
These, and macros, may introduce further constants. |
|
1518 |
||
1519 |
||
1520 |
\subsection{Example: a print translation for dependent types} |
|
1521 |
\indexbold{*_K}\indexbold{*dependent_tr'} |
|
1522 |
Let us continue the dependent type example (page~\pageref{prod_trans}) by |
|
1523 |
examining the parse translation for {\tt _K} and the print translation |
|
1524 |
{\tt dependent_tr'}, which are both built-in. By convention, parse |
|
1525 |
translations have names ending with {\tt _tr} and print translations have |
|
1526 |
names ending with {\tt _tr'}. Search for such names in the Isabelle |
|
1527 |
sources to locate more examples. |
|
1528 |
||
1529 |
Here is the parse translation for {\tt _K}: |
|
1530 |
\begin{ttbox} |
|
1531 |
fun k_tr [t] = Abs ("x", dummyT, incr_boundvars 1 t) |
|
1532 |
| k_tr ts = raise TERM("k_tr",ts); |
|
1533 |
\end{ttbox} |
|
1534 |
If {\tt k_tr} is called with exactly one argument~$t$, it creates a new |
|
1535 |
{\tt Abs} node with a body derived from $t$. Since terms given to parse |
|
1536 |
translations are not yet typed, the type of the bound variable in the new |
|
1537 |
{\tt Abs} is simply {\tt dummyT}. The function increments all {\tt Bound} |
|
1538 |
nodes referring to outer abstractions by calling \ttindex{incr_boundvars}, |
|
1539 |
a basic term manipulation function defined in {\tt Pure/term.ML}. |
|
1540 |
||
1541 |
Here is the print translation for dependent types: |
|
1542 |
\begin{ttbox} |
|
1543 |
fun dependent_tr' (q,r) (A :: Abs (x, T, B) :: ts) = |
|
1544 |
if 0 mem (loose_bnos B) then |
|
1545 |
let val (x', B') = variant_abs (x, dummyT, B); |
|
1546 |
in list_comb (Const (q, dummyT) $ Free (x', T) $ A $ B', ts) |
|
1547 |
end |
|
1548 |
else list_comb (Const (r, dummyT) $ A $ B, ts) |
|
1549 |
| dependent_tr' _ _ = raise Match; |
|
1550 |
\end{ttbox} |
|
1551 |
The argument {\tt (q,r)} is supplied to {\tt dependent_tr'} by a curried |
|
1552 |
function application during its installation. We could set up print |
|
1553 |
translations for both {\tt Pi} and {\tt Sigma} by including |
|
1554 |
\begin{ttbox} |
|
1555 |
val print_translation = |
|
1556 |
[("Pi", dependent_tr' ("{\at}PROD", "{\at}->")), |
|
1557 |
("Sigma", dependent_tr' ("{\at}SUM", "{\at}*"))]; |
|
1558 |
\end{ttbox} |
|
1559 |
within the {\tt ML} section. The first of these transforms ${\tt Pi}(A, |
|
1560 |
\mtt{Abs}(x, T, B))$ into $\hbox{\tt{\at}PROD}(x', A, B')$ or |
|
1561 |
$\hbox{\tt{\at}->}r(A, B)$, choosing the latter form if $B$ does not depend |
|
1562 |
on~$x$. It checks this using \ttindex{loose_bnos}, yet another function |
|
1563 |
from {\tt Pure/term.ML}. Note that $x'$ is a version of $x$ renamed away |
|
1564 |
from all names in $B$, and $B'$ the body $B$ with {\tt Bound} nodes |
|
1565 |
referring to our {\tt Abs} node replaced by $\ttfct{Free} (x', |
|
1566 |
\mtt{dummyT})$. |
|
1567 |
||
1568 |
We must be careful with types here. While types of {\tt Const}s are |
|
1569 |
ignored, type constraints may be printed for some {\tt Free}s and |
|
1570 |
{\tt Var}s if \ttindex{show_types} is set to {\tt true}. Variables of type |
|
1571 |
\ttindex{dummyT} are never printed with constraint, though. The line |
|
1572 |
\begin{ttbox} |
|
1573 |
let val (x', B') = variant_abs (x, dummyT, B); |
|
1574 |
\end{ttbox}\index{*variant_abs} |
|
1575 |
replaces bound variable occurrences in~$B$ by the free variable $x'$ with |
|
1576 |
type {\tt dummyT}. Only the binding occurrence of~$x'$ is given the |
|
1577 |
correct type~{\tt T}, so this is the only place where a type |
|
1578 |
constraint might appear. |
|
1579 |
\index{translations|)} |
|
1580 |
||
1581 |
||
1582 |