author | clasohm |
Mon, 14 Nov 1994 14:29:20 +0100 | |
changeset 711 | bb868a30e66f |
parent 452 | 395bbf6e55f9 |
child 864 | d63b111b917a |
permissions | -rw-r--r-- |
320 | 1 |
%% $Id$ |
2 |
\chapter{Defining Logics} \label{Defining-Logics} |
|
3 |
This chapter explains how to define new formal systems --- in particular, |
|
4 |
their concrete syntax. While Isabelle can be regarded as a theorem prover |
|
5 |
for set theory, higher-order logic or the sequent calculus, its |
|
6 |
distinguishing feature is support for the definition of new logics. |
|
7 |
||
8 |
Isabelle logics are hierarchies of theories, which are described and |
|
9 |
illustrated in |
|
10 |
\iflabelundefined{sec:defining-theories}{{\em Introduction to Isabelle}}% |
|
11 |
{\S\ref{sec:defining-theories}}. That material, together with the theory |
|
12 |
files provided in the examples directories, should suffice for all simple |
|
13 |
applications. The easiest way to define a new theory is by modifying a |
|
14 |
copy of an existing theory. |
|
15 |
||
16 |
This chapter documents the meta-logic syntax, mixfix declarations and |
|
17 |
pretty printing. The extended examples in \S\ref{sec:min_logics} |
|
18 |
demonstrate the logical aspects of the definition of theories. |
|
19 |
||
20 |
||
21 |
\section{Priority grammars} \label{sec:priority_grammars} |
|
22 |
\index{priority grammars|(} |
|
23 |
||
24 |
A context-free grammar contains a set of {\bf nonterminal symbols}, a set of |
|
25 |
{\bf terminal symbols} and a set of {\bf productions}\index{productions}. |
|
26 |
Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and |
|
27 |
$\gamma$ is a string of terminals and nonterminals. One designated |
|
28 |
nonterminal is called the {\bf start symbol}. The language defined by the |
|
29 |
grammar consists of all strings of terminals that can be derived from the |
|
30 |
start symbol by applying productions as rewrite rules. |
|
31 |
||
32 |
The syntax of an Isabelle logic is specified by a {\bf priority |
|
33 |
grammar}.\index{priorities} Each nonterminal is decorated by an integer |
|
34 |
priority, as in~$A^{(p)}$. A nonterminal $A^{(p)}$ in a derivation may be |
|
35 |
rewritten using a production $A^{(q)} = \gamma$ only if~$p \le q$. Any |
|
36 |
priority grammar can be translated into a normal context free grammar by |
|
37 |
introducing new nonterminals and productions. |
|
38 |
||
39 |
Formally, a set of context free productions $G$ induces a derivation |
|
40 |
relation $\longrightarrow@G$. Let $\alpha$ and $\beta$ denote strings of |
|
41 |
terminal or nonterminal symbols. Then |
|
42 |
\[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \] |
|
43 |
if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$. |
|
44 |
||
45 |
The following simple grammar for arithmetic expressions demonstrates how |
|
46 |
binding power and associativity of operators can be enforced by priorities. |
|
47 |
\begin{center} |
|
48 |
\begin{tabular}{rclr} |
|
49 |
$A^{(9)}$ & = & {\tt0} \\ |
|
50 |
$A^{(9)}$ & = & {\tt(} $A^{(0)}$ {\tt)} \\ |
|
51 |
$A^{(0)}$ & = & $A^{(0)}$ {\tt+} $A^{(1)}$ \\ |
|
52 |
$A^{(2)}$ & = & $A^{(3)}$ {\tt*} $A^{(2)}$ \\ |
|
53 |
$A^{(3)}$ & = & {\tt-} $A^{(3)}$ |
|
54 |
\end{tabular} |
|
55 |
\end{center} |
|
56 |
The choice of priorities determines that {\tt -} binds tighter than {\tt *}, |
|
57 |
which binds tighter than {\tt +}. Furthermore {\tt +} associates to the |
|
58 |
left and {\tt *} to the right. |
|
59 |
||
60 |
For clarity, grammars obey these conventions: |
|
61 |
\begin{itemize} |
|
62 |
\item All priorities must lie between~0 and \ttindex{max_pri}, which is a |
|
63 |
some fixed integer. Sometimes {\tt max_pri} is written as $\infty$. |
|
64 |
\item Priority 0 on the right-hand side and priority \ttindex{max_pri} on |
|
65 |
the left-hand side may be omitted. |
|
66 |
\item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the |
|
67 |
priority of the left-hand side actually appears in a column on the far |
|
68 |
right. |
|
69 |
\item Alternatives are separated by~$|$. |
|
70 |
\item Repetition is indicated by dots~(\dots) in an informal but obvious |
|
71 |
way. |
|
72 |
\end{itemize} |
|
73 |
||
74 |
Using these conventions and assuming $\infty=9$, the grammar |
|
75 |
takes the form |
|
76 |
\begin{center} |
|
77 |
\begin{tabular}{rclc} |
|
78 |
$A$ & = & {\tt0} & \hspace*{4em} \\ |
|
79 |
& $|$ & {\tt(} $A$ {\tt)} \\ |
|
80 |
& $|$ & $A$ {\tt+} $A^{(1)}$ & (0) \\ |
|
81 |
& $|$ & $A^{(3)}$ {\tt*} $A^{(2)}$ & (2) \\ |
|
82 |
& $|$ & {\tt-} $A^{(3)}$ & (3) |
|
83 |
\end{tabular} |
|
84 |
\end{center} |
|
85 |
\index{priority grammars|)} |
|
86 |
||
87 |
||
88 |
\begin{figure} |
|
89 |
\begin{center} |
|
90 |
\begin{tabular}{rclc} |
|
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
91 |
$any$ &=& $prop$ ~~$|$~~ $logic$ \\\\ |
320 | 92 |
$prop$ &=& {\tt PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\ |
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
93 |
&$|$& $any^{(3)}$ {\tt ==} $any^{(2)}$ & (2) \\ |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
94 |
&$|$& $any^{(3)}$ {\tt =?=} $any^{(2)}$ & (2) \\ |
320 | 95 |
&$|$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\ |
96 |
&$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop^{(1)}$ & (1) \\ |
|
97 |
&$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\ |
|
98 |
$aprop$ &=& $id$ ~~$|$~~ $var$ |
|
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
99 |
~~$|$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\\\ |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
100 |
$logic$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $logic$ {\tt)} \\ |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
101 |
&$|$& $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\ |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
102 |
&$|$& $logic^{(4)}$ {\tt::} $type$ & (4) \\ |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
103 |
&$|$& {\tt \%} $idts$ {\tt.} $any$ & (0) \\\\ |
320 | 104 |
$idts$ &=& $idt$ ~~$|$~~ $idt^{(1)}$ $idts$ \\\\ |
105 |
$idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\ |
|
106 |
&$|$& $id$ {\tt ::} $type$ & (0) \\\\ |
|
107 |
$type$ &=& $tid$ ~~$|$~~ $tvar$ ~~$|$~~ $tid$ {\tt::} $sort$ |
|
108 |
~~$|$~~ $tvar$ {\tt::} $sort$ \\ |
|
109 |
&$|$& $id$ ~~$|$~~ $type^{(\infty)}$ $id$ |
|
110 |
~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\ |
|
111 |
&$|$& $type^{(1)}$ {\tt =>} $type$ & (0) \\ |
|
112 |
&$|$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\ |
|
113 |
&$|$& {\tt(} $type$ {\tt)} \\\\ |
|
114 |
$sort$ &=& $id$ ~~$|$~~ {\tt\ttlbrace\ttrbrace} |
|
115 |
~~$|$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace} |
|
116 |
\end{tabular} |
|
117 |
\index{*PROP symbol} |
|
118 |
\index{*== symbol}\index{*=?= symbol}\index{*==> symbol} |
|
119 |
\index{*:: symbol}\index{*=> symbol} |
|
332 | 120 |
\index{sort constraints} |
121 |
%the index command: a percent is permitted, but braces must match! |
|
320 | 122 |
\index{%@{\tt\%} symbol} |
123 |
\index{{}@{\tt\ttlbrace} symbol}\index{{}@{\tt\ttrbrace} symbol} |
|
124 |
\index{*[ symbol}\index{*] symbol} |
|
125 |
\index{*"!"! symbol} |
|
126 |
\index{*"["| symbol} |
|
127 |
\index{*"|"] symbol} |
|
128 |
\end{center} |
|
129 |
\caption{Meta-logic syntax}\label{fig:pure_gram} |
|
130 |
\end{figure} |
|
131 |
||
132 |
||
133 |
\section{The Pure syntax} \label{sec:basic_syntax} |
|
134 |
\index{syntax!Pure|(} |
|
135 |
||
136 |
At the root of all object-logics lies the theory \thydx{Pure}. It |
|
137 |
contains, among many other things, the Pure syntax. An informal account of |
|
138 |
this basic syntax (types, terms and formulae) appears in |
|
139 |
\iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}% |
|
140 |
{\S\ref{sec:forward}}. A more precise description using a priority grammar |
|
141 |
appears in Fig.\ts\ref{fig:pure_gram}. It defines the following |
|
142 |
nonterminals: |
|
143 |
\begin{ttdescription} |
|
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
144 |
\item[\ndxbold{prop}] denotes terms of type {\tt prop}. These are formulae |
320 | 145 |
of the meta-logic. |
146 |
||
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
147 |
\item[\ndxbold{aprop}] denotes atomic propositions. These typically |
320 | 148 |
include the judgement forms of the object-logic; its definition |
149 |
introduces a meta-level predicate for each judgement form. |
|
150 |
||
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
151 |
\item[\ndxbold{logic}] denotes terms whose type belongs to class |
320 | 152 |
\cldx{logic}. As the syntax is extended by new object-logics, more |
153 |
productions for {\tt logic} are added automatically (see below). |
|
154 |
||
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
155 |
\item[\ndxbold{any}] denotes terms that either belong to {\tt prop} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
156 |
or {\tt logic}. |
320 | 157 |
|
158 |
\item[\ndxbold{type}] denotes types of the meta-logic. |
|
159 |
||
160 |
\item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained |
|
161 |
by types. |
|
162 |
\end{ttdescription} |
|
163 |
||
164 |
\begin{warn} |
|
165 |
In {\tt idts}, note that \verb|x::nat y| is parsed as \verb|x::(nat y)|, |
|
166 |
treating {\tt y} like a type constructor applied to {\tt nat}. The |
|
167 |
likely result is an error message. To avoid this interpretation, use |
|
168 |
parentheses and write \verb|(x::nat) y|. |
|
332 | 169 |
\index{type constraints}\index{*:: symbol} |
320 | 170 |
|
171 |
Similarly, \verb|x::nat y::nat| is parsed as \verb|x::(nat y::nat)| and |
|
172 |
yields an error. The correct form is \verb|(x::nat) (y::nat)|. |
|
173 |
\end{warn} |
|
174 |
||
175 |
\subsection{Logical types and default syntax}\label{logical-types} |
|
176 |
\index{lambda calc@$\lambda$-calculus} |
|
177 |
||
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
178 |
Isabelle's representation of mathematical languages is based on the |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
179 |
simply typed $\lambda$-calculus. All logical types, namely those of |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
180 |
class \cldx{logic}, are automatically equipped with a basic syntax of |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
181 |
types, identifiers, variables, parentheses, $\lambda$-abstractions and |
320 | 182 |
applications. |
183 |
||
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
184 |
More precisely, Isabelle internally replaces every nonterminal by |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
185 |
$logic$ if it belongs to a subclass of \cldx{logic}. Thereby these |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
186 |
productions (which actually are productions of the nonterminal |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
187 |
$logic$) can be used for $ty$: |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
188 |
|
320 | 189 |
\begin{center} |
190 |
\begin{tabular}{rclc} |
|
191 |
$ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\ |
|
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
192 |
&$|$& $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)}\\ |
452 | 193 |
&$|$& $ty^{(4)}$ {\tt::} $type$ ~~~~~~~ (3) \\\\ |
320 | 194 |
\end{tabular} |
195 |
\end{center} |
|
196 |
||
452 | 197 |
\begin{warn} |
198 |
Type constraints bind very weakly. For example, \verb!x<y::nat! is normally |
|
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
199 |
parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 3 or less, in |
452 | 200 |
which case the string is likely to be ambiguous. The correct form is |
201 |
\verb!x<(y::nat)!. |
|
202 |
\end{warn} |
|
320 | 203 |
|
204 |
\subsection{Lexical matters} |
|
205 |
The parser does not process input strings directly. It operates on token |
|
206 |
lists provided by Isabelle's \bfindex{lexer}. There are two kinds of |
|
207 |
tokens: \bfindex{delimiters} and \bfindex{name tokens}. |
|
208 |
||
209 |
\index{reserved words} |
|
210 |
Delimiters can be regarded as reserved words of the syntax. You can |
|
211 |
add new ones when extending theories. In Fig.\ts\ref{fig:pure_gram} they |
|
212 |
appear in typewriter font, for example {\tt ==}, {\tt =?=} and |
|
213 |
{\tt PROP}\@. |
|
214 |
||
215 |
Name tokens have a predefined syntax. The lexer distinguishes four |
|
216 |
disjoint classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type |
|
217 |
identifiers\index{type identifiers}, type unknowns\index{type unknowns}. |
|
218 |
They are denoted by \ndxbold{id}, \ndxbold{var}, \ndxbold{tid}, |
|
219 |
\ndxbold{tvar}, respectively. Typical examples are {\tt x}, {\tt ?x7}, |
|
220 |
{\tt 'a}, {\tt ?'a3}. Here is the precise syntax: |
|
221 |
\begin{eqnarray*} |
|
222 |
id & = & letter~quasiletter^* \\ |
|
223 |
var & = & \mbox{\tt ?}id ~~|~~ \mbox{\tt ?}id\mbox{\tt .}nat \\ |
|
224 |
tid & = & \mbox{\tt '}id \\ |
|
225 |
tvar & = & \mbox{\tt ?}tid ~~|~~ |
|
226 |
\mbox{\tt ?}tid\mbox{\tt .}nat \\[1ex] |
|
227 |
letter & = & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\ |
|
228 |
digit & = & \mbox{one of {\tt 0}\dots {\tt 9}} \\ |
|
229 |
quasiletter & = & letter ~~|~~ digit ~~|~~ \mbox{\tt _} ~~|~~ \mbox{\tt '} \\ |
|
230 |
nat & = & digit^+ |
|
231 |
\end{eqnarray*} |
|
232 |
A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally |
|
233 |
a pair of base name and index (\ML\ type \mltydx{indexname}). These |
|
234 |
components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or |
|
235 |
run together as in {\tt ?x1}. The latter form is possible if the base name |
|
236 |
does not end with digits. If the index is 0, it may be dropped altogether: |
|
237 |
{\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}. |
|
238 |
||
239 |
The lexer repeatedly takes the maximal prefix of the input string that |
|
240 |
forms a valid token. A maximal prefix that is both a delimiter and a name |
|
241 |
is treated as a delimiter. Spaces, tabs and newlines are separators; they |
|
242 |
never occur within tokens. |
|
243 |
||
244 |
Delimiters need not be separated by white space. For example, if {\tt -} |
|
245 |
is a delimiter but {\tt --} is not, then the string {\tt --} is treated as |
|
246 |
two consecutive occurrences of the token~{\tt -}. In contrast, \ML\ |
|
247 |
treats {\tt --} as a single symbolic name. The consequence of Isabelle's |
|
248 |
more liberal scheme is that the same string may be parsed in different ways |
|
249 |
after extending the syntax: after adding {\tt --} as a delimiter, the input |
|
250 |
{\tt --} is treated as a single token. |
|
251 |
||
252 |
Although name tokens are returned from the lexer rather than the parser, it |
|
253 |
is more logical to regard them as nonterminals. Delimiters, however, are |
|
254 |
terminals; they are just syntactic sugar and contribute nothing to the |
|
255 |
abstract syntax tree. |
|
256 |
||
257 |
||
258 |
\subsection{*Inspecting the syntax} |
|
259 |
\begin{ttbox} |
|
260 |
syn_of : theory -> Syntax.syntax |
|
261 |
Syntax.print_syntax : Syntax.syntax -> unit |
|
262 |
Syntax.print_gram : Syntax.syntax -> unit |
|
263 |
Syntax.print_trans : Syntax.syntax -> unit |
|
264 |
\end{ttbox} |
|
265 |
The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes |
|
266 |
in \ML. You can display values of this type by calling the following |
|
267 |
functions: |
|
268 |
\begin{ttdescription} |
|
269 |
\item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle |
|
270 |
theory~{\it thy} as an \ML\ value. |
|
271 |
||
272 |
\item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all |
|
273 |
information contained in the syntax {\it syn}. The displayed output can |
|
274 |
be large. The following two functions are more selective. |
|
275 |
||
276 |
\item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part |
|
277 |
of~{\it syn}, namely the lexicon, roots and productions. These are |
|
278 |
discussed below. |
|
279 |
||
280 |
\item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation |
|
281 |
part of~{\it syn}, namely the constants, parse/print macros and |
|
282 |
parse/print translations. |
|
283 |
\end{ttdescription} |
|
284 |
||
285 |
Let us demonstrate these functions by inspecting Pure's syntax. Even that |
|
286 |
is too verbose to display in full. |
|
287 |
\begin{ttbox}\index{*Pure theory} |
|
288 |
Syntax.print_syntax (syn_of Pure.thy); |
|
289 |
{\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots} |
|
290 |
{\out roots: logic type fun prop} |
|
291 |
{\out prods:} |
|
292 |
{\out type = tid (1000)} |
|
293 |
{\out type = tvar (1000)} |
|
294 |
{\out type = id (1000)} |
|
295 |
{\out type = tid "::" sort[0] => "_ofsort" (1000)} |
|
296 |
{\out type = tvar "::" sort[0] => "_ofsort" (1000)} |
|
297 |
{\out \vdots} |
|
298 |
\ttbreak |
|
299 |
{\out consts: "_K" "_appl" "_aprop" "_args" "_asms" "_bigimpl" \dots} |
|
300 |
{\out parse_ast_translation: "_appl" "_bigimpl" "_bracket"} |
|
301 |
{\out "_idtyp" "_lambda" "_tapp" "_tappl"} |
|
302 |
{\out parse_rules:} |
|
303 |
{\out parse_translation: "!!" "_K" "_abs" "_aprop"} |
|
304 |
{\out print_translation: "all"} |
|
305 |
{\out print_rules:} |
|
306 |
{\out print_ast_translation: "==>" "_abs" "_idts" "fun"} |
|
307 |
\end{ttbox} |
|
308 |
||
332 | 309 |
As you can see, the output is divided into labelled sections. The grammar |
320 | 310 |
is represented by {\tt lexicon}, {\tt roots} and {\tt prods}. The rest |
311 |
refers to syntactic translations and macro expansion. Here is an |
|
312 |
explanation of the various sections. |
|
313 |
\begin{description} |
|
314 |
\item[{\tt lexicon}] lists the delimiters used for lexical |
|
315 |
analysis.\index{delimiters} |
|
316 |
||
317 |
\item[{\tt roots}] lists the grammar's nonterminal symbols. You must |
|
318 |
name the desired root when calling lower level functions or specifying |
|
319 |
macros. Higher level functions usually expect a type and derive the |
|
320 |
actual root as described in~\S\ref{sec:grammar}. |
|
321 |
||
322 |
\item[{\tt prods}] lists the \rmindex{productions} of the priority grammar. |
|
323 |
The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}. |
|
324 |
Each delimiter is quoted. Some productions are shown with {\tt =>} and |
|
325 |
an attached string. These strings later become the heads of parse |
|
326 |
trees; they also play a vital role when terms are printed (see |
|
327 |
\S\ref{sec:asts}). |
|
328 |
||
329 |
Productions with no strings attached are called {\bf copy |
|
330 |
productions}\indexbold{productions!copy}. Their right-hand side must |
|
331 |
have exactly one nonterminal symbol (or name token). The parser does |
|
332 |
not create a new parse tree node for copy productions, but simply |
|
333 |
returns the parse tree of the right-hand symbol. |
|
334 |
||
335 |
If the right-hand side consists of a single nonterminal with no |
|
336 |
delimiters, then the copy production is called a {\bf chain |
|
337 |
production}. Chain productions act as abbreviations: |
|
338 |
conceptually, they are removed from the grammar by adding new |
|
339 |
productions. Priority information attached to chain productions is |
|
340 |
ignored; only the dummy value $-1$ is displayed. |
|
341 |
||
342 |
\item[{\tt consts}, {\tt parse_rules}, {\tt print_rules}] |
|
343 |
relate to macros (see \S\ref{sec:macros}). |
|
344 |
||
345 |
\item[{\tt parse_ast_translation}, {\tt print_ast_translation}] |
|
346 |
list sets of constants that invoke translation functions for abstract |
|
347 |
syntax trees. Section \S\ref{sec:asts} below discusses this obscure |
|
348 |
matter.\index{constants!for translations} |
|
349 |
||
350 |
\item[{\tt parse_translation}, {\tt print_translation}] list sets |
|
351 |
of constants that invoke translation functions for terms (see |
|
352 |
\S\ref{sec:tr_funs}). |
|
353 |
\end{description} |
|
354 |
\index{syntax!Pure|)} |
|
355 |
||
356 |
||
357 |
\section{Mixfix declarations} \label{sec:mixfix} |
|
358 |
\index{mixfix declarations|(} |
|
359 |
||
360 |
When defining a theory, you declare new constants by giving their names, |
|
361 |
their type, and an optional {\bf mixfix annotation}. Mixfix annotations |
|
362 |
allow you to extend Isabelle's basic $\lambda$-calculus syntax with |
|
363 |
readable notation. They can express any context-free priority grammar. |
|
364 |
Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more |
|
365 |
general than the priority declarations of \ML\ and Prolog. |
|
366 |
||
367 |
A mixfix annotation defines a production of the priority grammar. It |
|
368 |
describes the concrete syntax, the translation to abstract syntax, and the |
|
369 |
pretty printing. Special case annotations provide a simple means of |
|
370 |
specifying infix operators, binders and so forth. |
|
371 |
||
372 |
\subsection{Grammar productions}\label{sec:grammar}\index{productions} |
|
373 |
||
374 |
Let us examine the treatment of the production |
|
375 |
\[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\, |
|
376 |
A@n^{(p@n)}\, w@n. \] |
|
377 |
Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$, |
|
378 |
\ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals. |
|
379 |
In the corresponding mixfix annotation, the priorities are given separately |
|
380 |
as $[p@1,\ldots,p@n]$ and~$p$. The nonterminal symbols are identified with |
|
381 |
types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's |
|
382 |
effect on nonterminals is expressed as the function type |
|
383 |
\[ [\tau@1, \ldots, \tau@n]\To \tau. \] |
|
384 |
Finally, the template |
|
385 |
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n \] |
|
386 |
describes the strings of terminals. |
|
387 |
||
388 |
A simple type is typically declared for each nonterminal symbol. In |
|
389 |
first-order logic, type~$i$ stands for terms and~$o$ for formulae. Only |
|
390 |
the outermost type constructor is taken into account. For example, any |
|
391 |
type of the form $\sigma list$ stands for a list; productions may refer |
|
332 | 392 |
to the symbol {\tt list} and will apply to lists of any type. |
320 | 393 |
|
394 |
The symbol associated with a type is called its {\bf root} since it may |
|
395 |
serve as the root of a parse tree. Precisely, the root of $(\tau@1, \dots, |
|
396 |
\tau@n)ty$ is $ty$, where $\tau@1$, \ldots, $\tau@n$ are types and $ty$ is |
|
397 |
a type constructor. Type infixes are a special case of this; in |
|
398 |
particular, the root of $\tau@1 \To \tau@2$ is {\tt fun}. Finally, the |
|
399 |
root of a type variable is {\tt logic}; general productions might |
|
400 |
refer to this nonterminal. |
|
401 |
||
402 |
Identifying nonterminals with types allows a constant's type to specify |
|
403 |
syntax as well. We can declare the function~$f$ to have type $[\tau@1, |
|
404 |
\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the |
|
405 |
layout of the function's $n$ arguments. The constant's name, in this |
|
406 |
case~$f$, will also serve as the label in the abstract syntax tree. There |
|
407 |
are two exceptions to this treatment of constants: |
|
408 |
\begin{enumerate}\index{constants!syntactic} |
|
409 |
\item A production need not map directly to a logical function. In this |
|
410 |
case, you must declare a constant whose purpose is purely syntactic. |
|
411 |
By convention such constants begin with the symbol~{\tt\at}, |
|
412 |
ensuring that they can never be written in formulae. |
|
413 |
||
414 |
\item A copy production has no associated constant.\index{productions!copy} |
|
415 |
\end{enumerate} |
|
416 |
There is something artificial about this representation of productions, |
|
417 |
but it is convenient, particularly for simple theory extensions. |
|
418 |
||
419 |
\subsection{The general mixfix form} |
|
420 |
Here is a detailed account of mixfix declarations. Suppose the following |
|
421 |
line occurs within the {\tt consts} section of a {\tt .thy} file: |
|
422 |
\begin{center} |
|
423 |
{\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)} |
|
424 |
\end{center} |
|
332 | 425 |
This constant declaration and mixfix annotation are interpreted as follows: |
320 | 426 |
\begin{itemize}\index{productions} |
427 |
\item The string {\tt $c$} is the name of the constant associated with the |
|
428 |
production; unless it is a valid identifier, it must be enclosed in |
|
429 |
quotes. If $c$ is empty (given as~{\tt ""}) then this is a copy |
|
430 |
production.\index{productions!copy} Otherwise, parsing an instance of the |
|
431 |
phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$ |
|
432 |
$a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$-th |
|
433 |
argument. |
|
434 |
||
435 |
\item The constant $c$, if non-empty, is declared to have type $\sigma$. |
|
436 |
||
437 |
\item The string $template$ specifies the right-hand side of |
|
438 |
the production. It has the form |
|
439 |
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] |
|
440 |
where each occurrence of {\tt_} denotes an argument position and |
|
441 |
the~$w@i$ do not contain~{\tt _}. (If you want a literal~{\tt _} in |
|
442 |
the concrete syntax, you must escape it as described below.) The $w@i$ |
|
443 |
may consist of \rmindex{delimiters}, spaces or |
|
444 |
\rmindex{pretty printing} annotations (see below). |
|
445 |
||
446 |
\item The type $\sigma$ specifies the production's nonterminal symbols |
|
447 |
(or name tokens). If $template$ is of the form above then $\sigma$ |
|
448 |
must be a function type with at least~$n$ argument positions, say |
|
449 |
$\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are |
|
450 |
derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described |
|
451 |
above. Any of these may be function types; the corresponding root is |
|
452 |
then \tydx{fun}. |
|
453 |
||
454 |
\item The optional list~$ps$ may contain at most $n$ integers, say {\tt |
|
455 |
[$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal |
|
456 |
priority\indexbold{priorities} required of any phrase that may appear |
|
457 |
as the $i$-th argument. Missing priorities default to~0. |
|
458 |
||
459 |
\item The integer $p$ is the priority of this production. If omitted, it |
|
460 |
defaults to the maximal priority. |
|
461 |
Priorities range between 0 and \ttindexbold{max_pri} (= 1000). |
|
462 |
\end{itemize} |
|
463 |
% |
|
464 |
The declaration {\tt $c$ ::\ "$\sigma$" ("$template$")} specifies no |
|
465 |
priorities. The resulting production puts no priority constraints on any |
|
466 |
of its arguments and has maximal priority itself. Omitting priorities in |
|
467 |
this manner will introduce syntactic ambiguities unless the production's |
|
468 |
right-hand side is fully bracketed, as in \verb|"if _ then _ else _ fi"|. |
|
469 |
||
470 |
Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"}, |
|
471 |
is sensible only if~$c$ is an identifier. Otherwise you will be unable to |
|
472 |
write terms involving~$c$. |
|
473 |
||
474 |
\begin{warn} |
|
475 |
Theories must sometimes declare types for purely syntactic purposes. One |
|
476 |
example is \tydx{type}, the built-in type of types. This is a `type of |
|
477 |
all types' in the syntactic sense only. Do not declare such types under |
|
478 |
{\tt arities} as belonging to class {\tt logic}\index{*logic class}, for |
|
479 |
that would allow their use in arbitrary Isabelle |
|
480 |
expressions~(\S\ref{logical-types}). |
|
481 |
\end{warn} |
|
482 |
||
483 |
\subsection{Example: arithmetic expressions} |
|
484 |
\index{examples!of mixfix declarations} |
|
485 |
This theory specification contains a {\tt consts} section with mixfix |
|
486 |
declarations encoding the priority grammar from |
|
487 |
\S\ref{sec:priority_grammars}: |
|
488 |
\begin{ttbox} |
|
489 |
EXP = Pure + |
|
490 |
types |
|
491 |
exp |
|
492 |
arities |
|
493 |
exp :: logic |
|
494 |
consts |
|
495 |
"0" :: "exp" ("0" 9) |
|
496 |
"+" :: "[exp, exp] => exp" ("_ + _" [0, 1] 0) |
|
497 |
"*" :: "[exp, exp] => exp" ("_ * _" [3, 2] 2) |
|
498 |
"-" :: "exp => exp" ("- _" [3] 3) |
|
499 |
end |
|
500 |
\end{ttbox} |
|
501 |
The {\tt arities} declaration causes {\tt exp} to be added as a new root. |
|
332 | 502 |
If you put this into a file {\tt EXP.thy} and load it via {\tt |
320 | 503 |
use_thy "EXP"}, you can run some tests: |
504 |
\begin{ttbox} |
|
505 |
val read_exp = Syntax.test_read (syn_of EXP.thy) "exp"; |
|
506 |
{\out val it = fn : string -> unit} |
|
507 |
read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0"; |
|
508 |
{\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"} |
|
509 |
{\out raw: ("+" ("+" ("+" ("*" "0" ("*" "0" ("*" "0" "0"))) "0") "0") "0")} |
|
510 |
{\out \vdots} |
|
511 |
read_exp "0 + - 0 + 0"; |
|
512 |
{\out tokens: "0" "+" "-" "0" "+" "0"} |
|
513 |
{\out raw: ("+" ("+" "0" ("-" "0")) "0")} |
|
514 |
{\out \vdots} |
|
515 |
\end{ttbox} |
|
516 |
The output of \ttindex{Syntax.test_read} includes the token list ({\tt |
|
517 |
tokens}) and the raw \AST{} directly derived from the parse tree, |
|
518 |
ignoring parse \AST{} translations. The rest is tracing information |
|
519 |
provided by the macro expander (see \S\ref{sec:macros}). |
|
520 |
||
521 |
Executing {\tt Syntax.print_gram} reveals the productions derived |
|
522 |
from our mixfix declarations (lots of additional information deleted): |
|
523 |
\begin{ttbox} |
|
524 |
Syntax.print_gram (syn_of EXP.thy); |
|
525 |
{\out exp = "0" => "0" (9)} |
|
526 |
{\out exp = exp[0] "+" exp[1] => "+" (0)} |
|
527 |
{\out exp = exp[3] "*" exp[2] => "*" (2)} |
|
528 |
{\out exp = "-" exp[3] => "-" (3)} |
|
529 |
\end{ttbox} |
|
530 |
||
531 |
||
532 |
\subsection{The mixfix template} |
|
533 |
Let us take a closer look at the string $template$ appearing in mixfix |
|
534 |
annotations. This string specifies a list of parsing and printing |
|
535 |
directives: delimiters\index{delimiters}, arguments, spaces, blocks of |
|
536 |
indentation and line breaks. These are encoded by the following character |
|
537 |
sequences: |
|
538 |
\index{pretty printing|(} |
|
539 |
\begin{description} |
|
540 |
\item[~$d$~] is a delimiter, namely a non-empty sequence of characters |
|
541 |
other than the special characters {\tt _}, {\tt(}, {\tt)} and~{\tt/}. |
|
542 |
Even these characters may appear if escaped; this means preceding it with |
|
543 |
a~{\tt '} (single quote). Thus you have to write {\tt ''} if you really |
|
544 |
want a single quote. Delimiters may never contain spaces. |
|
545 |
||
546 |
\item[~{\tt_}~] is an argument position, which stands for a nonterminal symbol |
|
547 |
or name token. |
|
548 |
||
549 |
\item[~$s$~] is a non-empty sequence of spaces for printing. This and the |
|
550 |
following specifications do not affect parsing at all. |
|
551 |
||
552 |
\item[~{\tt(}$n$~] opens a pretty printing block. The optional number $n$ |
|
553 |
specifies how much indentation to add when a line break occurs within the |
|
554 |
block. If {\tt(} is not followed by digits, the indentation defaults |
|
555 |
to~0. |
|
556 |
||
557 |
\item[~{\tt)}~] closes a pretty printing block. |
|
558 |
||
559 |
\item[~{\tt//}~] forces a line break. |
|
560 |
||
561 |
\item[~{\tt/}$s$~] allows a line break. Here $s$ stands for the string of |
|
562 |
spaces (zero or more) right after the {\tt /} character. These spaces |
|
563 |
are printed if the break is not taken. |
|
564 |
\end{description} |
|
565 |
For example, the template {\tt"(_ +/ _)"} specifies an infix operator. |
|
566 |
There are two argument positions; the delimiter~{\tt+} is preceded by a |
|
567 |
space and followed by a space or line break; the entire phrase is a pretty |
|
568 |
printing block. Other examples appear in Fig.\ts\ref{fig:set_trans} below. |
|
569 |
Isabelle's pretty printer resembles the one described in |
|
570 |
Paulson~\cite{paulson91}. |
|
571 |
||
572 |
\index{pretty printing|)} |
|
573 |
||
574 |
||
575 |
\subsection{Infixes} |
|
576 |
\indexbold{infixes} |
|
577 |
||
578 |
Infix operators associating to the left or right can be declared |
|
579 |
using {\tt infixl} or {\tt infixr}. |
|
580 |
Roughly speaking, the form {\tt $c$ ::\ "$\sigma$" (infixl $p$)} |
|
581 |
abbreviates the constant declarations |
|
582 |
\begin{ttbox} |
|
583 |
"op \(c\)" :: "\(\sigma\)" ("op \(c\)") |
|
584 |
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\)) |
|
585 |
\end{ttbox} |
|
586 |
and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the constant declarations |
|
587 |
\begin{ttbox} |
|
588 |
"op \(c\)" :: "\(\sigma\)" ("op \(c\)") |
|
589 |
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\)) |
|
590 |
\end{ttbox} |
|
591 |
The infix operator is declared as a constant with the prefix {\tt op}. |
|
592 |
Thus, prefixing infixes with \sdx{op} makes them behave like ordinary |
|
593 |
function symbols, as in \ML. Special characters occurring in~$c$ must be |
|
594 |
escaped, as in delimiters, using a single quote. |
|
595 |
||
596 |
The expanded forms above would be actually illegal in a {\tt .thy} file |
|
597 |
because they declare the constant \hbox{\tt"op \(c\)"} twice. |
|
598 |
||
599 |
||
600 |
\subsection{Binders} |
|
601 |
\indexbold{binders} |
|
602 |
\begingroup |
|
603 |
\def\Q{{\cal Q}} |
|
604 |
A {\bf binder} is a variable-binding construct such as a quantifier. The |
|
605 |
constant declaration |
|
606 |
\begin{ttbox} |
|
607 |
\(c\) :: "\(\sigma\)" (binder "\(\Q\)" \(p\)) |
|
608 |
\end{ttbox} |
|
609 |
introduces a constant~$c$ of type~$\sigma$, which must have the form |
|
610 |
$(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where |
|
611 |
$x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$ |
|
612 |
and the whole term has type~$\tau@3$. Special characters in $\Q$ must be |
|
613 |
escaped using a single quote. |
|
614 |
||
615 |
The declaration is expanded internally to |
|
616 |
\begin{ttbox} |
|
617 |
\(c\) :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)" |
|
618 |
"\(\Q\)"\hskip-3pt :: "[idts, \(\tau@2\)] => \(\tau@3\)" ("(3\(\Q\)_./ _)" \(p\)) |
|
619 |
\end{ttbox} |
|
620 |
Here \ndx{idts} is the nonterminal symbol for a list of identifiers with |
|
332 | 621 |
\index{type constraints} |
320 | 622 |
optional type constraints (see Fig.\ts\ref{fig:pure_gram}). The |
623 |
declaration also installs a parse translation\index{translations!parse} |
|
624 |
for~$\Q$ and a print translation\index{translations!print} for~$c$ to |
|
625 |
translate between the internal and external forms. |
|
626 |
||
627 |
A binder of type $(\sigma \To \tau) \To \tau$ can be nested by giving a |
|
628 |
list of variables. The external form $\Q~x@1~x@2 \ldots x@n. P$ |
|
629 |
corresponds to the internal form |
|
630 |
\[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)). \] |
|
631 |
||
632 |
\medskip |
|
633 |
For example, let us declare the quantifier~$\forall$:\index{quantifiers} |
|
634 |
\begin{ttbox} |
|
635 |
All :: "('a => o) => o" (binder "ALL " 10) |
|
636 |
\end{ttbox} |
|
637 |
This lets us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL |
|
638 |
$x$.$P$}. When printing, Isabelle prefers the latter form, but must fall |
|
639 |
back on ${\tt All}(P)$ if $P$ is not an abstraction. Both $P$ and {\tt ALL |
|
640 |
$x$.$P$} have type~$o$, the type of formulae, while the bound variable |
|
641 |
can be polymorphic. |
|
642 |
\endgroup |
|
643 |
||
644 |
\index{mixfix declarations|)} |
|
645 |
||
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
646 |
\section{Ambiguity of parsed expressions} \label{sec:ambiguity} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
647 |
\index{ambiguity!of parsed expressions} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
648 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
649 |
To keep the grammar small and allow common productions to be shared |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
650 |
all logical types are internally represented |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
651 |
by one nonterminal, namely {\tt logic}. This and omitted or too freely |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
652 |
chosen priorities may lead to ways of parsing an expression that were |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
653 |
not intended by the theory's maker. In most cases Isabelle is able to |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
654 |
select one of multiple parse trees that an expression has lead |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
655 |
to by checking which of them can be typed correctly. But this may not |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
656 |
work in every case and always slows down parsing. |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
657 |
The warning and error messages that can be produced during this process are |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
658 |
as follows: |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
659 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
660 |
If an ambiguity can be resolved by type inference this warning |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
661 |
is shown to remind the user that parsing is (unnecessarily) slowed |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
662 |
down: |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
663 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
664 |
\begin{ttbox} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
665 |
{\out Warning: Ambiguous input "..."} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
666 |
{\out produces the following parse trees:} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
667 |
{\out ...} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
668 |
{\out Fortunately, only one parse tree is type correct.} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
669 |
{\out It helps (speed!) if you disambiguate your grammar or your input.} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
670 |
\end{ttbox} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
671 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
672 |
The following message is normally caused by using the same |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
673 |
syntax in two different productions: |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
674 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
675 |
\begin{ttbox} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
676 |
{\out Warning: Ambiguous input "..."} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
677 |
{\out produces the following parse trees:} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
678 |
{\out ...} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
679 |
{\out Error: More than one term is type correct:} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
680 |
{\out ...} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
681 |
\end{ttbox} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
682 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
683 |
On the other hand it's also possible that none of the parse trees can be |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
684 |
typed correctly though the user did not make a mistake. By default Isabelle |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
685 |
assumes that the type of a syntax translation rule is {\tt logic} but does |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
686 |
not look at the type unless parsing the rule produces more than one parse |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
687 |
tree. In that case this message is output if the rule's type is different |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
688 |
from {\tt logic}: |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
689 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
690 |
\begin{ttbox} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
691 |
{\out Warning: Ambiguous input "..."} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
692 |
{\out produces the following parse trees:} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
693 |
{\out ...} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
694 |
{\out This occured in syntax translation rule: "..." -> "..."} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
695 |
{\out Type checking error: Term does not have expected type} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
696 |
{\out ...} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
697 |
\end{ttbox} |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
698 |
|
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
699 |
To circumvent this the rule's type has to be stated. |
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
700 |
|
320 | 701 |
|
702 |
\section{Example: some minimal logics} \label{sec:min_logics} |
|
703 |
\index{examples!of logic definitions} |
|
704 |
||
705 |
This section presents some examples that have a simple syntax. They |
|
706 |
demonstrate how to define new object-logics from scratch. |
|
707 |
||
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset
|
708 |
First we must define how an object-logic syntax is embedded into the |
320 | 709 |
meta-logic. Since all theorems must conform to the syntax for~\ndx{prop} (see |
710 |
Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the |
|
711 |
object-level syntax. Assume that the syntax of your object-logic defines a |
|
712 |
nonterminal symbol~\ndx{o} of formulae. These formulae can now appear in |
|
713 |
axioms and theorems wherever \ndx{prop} does if you add the production |
|
714 |
\[ prop ~=~ o. \] |
|
715 |
This is not a copy production but a coercion from formulae to propositions: |
|
716 |
\begin{ttbox} |
|
717 |
Base = Pure + |
|
718 |
types |
|
719 |
o |
|
720 |
arities |
|
721 |
o :: logic |
|
722 |
consts |
|
723 |
Trueprop :: "o => prop" ("_" 5) |
|
724 |
end |
|
725 |
\end{ttbox} |
|
726 |
The constant \cdx{Trueprop} (the name is arbitrary) acts as an invisible |
|
332 | 727 |
coercion function. Assuming this definition resides in a file {\tt Base.thy}, |
320 | 728 |
you have to load it with the command {\tt use_thy "Base"}. |
729 |
||
730 |
One of the simplest nontrivial logics is {\bf minimal logic} of |
|
731 |
implication. Its definition in Isabelle needs no advanced features but |
|
732 |
illustrates the overall mechanism nicely: |
|
733 |
\begin{ttbox} |
|
734 |
Hilbert = Base + |
|
735 |
consts |
|
736 |
"-->" :: "[o, o] => o" (infixr 10) |
|
737 |
rules |
|
738 |
K "P --> Q --> P" |
|
739 |
S "(P --> Q --> R) --> (P --> Q) --> P --> R" |
|
740 |
MP "[| P --> Q; P |] ==> Q" |
|
741 |
end |
|
742 |
\end{ttbox} |
|
332 | 743 |
After loading this definition from the file {\tt Hilbert.thy}, you can |
320 | 744 |
start to prove theorems in the logic: |
745 |
\begin{ttbox} |
|
746 |
goal Hilbert.thy "P --> P"; |
|
747 |
{\out Level 0} |
|
748 |
{\out P --> P} |
|
749 |
{\out 1. P --> P} |
|
750 |
\ttbreak |
|
751 |
by (resolve_tac [Hilbert.MP] 1); |
|
752 |
{\out Level 1} |
|
753 |
{\out P --> P} |
|
754 |
{\out 1. ?P --> P --> P} |
|
755 |
{\out 2. ?P} |
|
756 |
\ttbreak |
|
757 |
by (resolve_tac [Hilbert.MP] 1); |
|
758 |
{\out Level 2} |
|
759 |
{\out P --> P} |
|
760 |
{\out 1. ?P1 --> ?P --> P --> P} |
|
761 |
{\out 2. ?P1} |
|
762 |
{\out 3. ?P} |
|
763 |
\ttbreak |
|
764 |
by (resolve_tac [Hilbert.S] 1); |
|
765 |
{\out Level 3} |
|
766 |
{\out P --> P} |
|
767 |
{\out 1. P --> ?Q2 --> P} |
|
768 |
{\out 2. P --> ?Q2} |
|
769 |
\ttbreak |
|
770 |
by (resolve_tac [Hilbert.K] 1); |
|
771 |
{\out Level 4} |
|
772 |
{\out P --> P} |
|
773 |
{\out 1. P --> ?Q2} |
|
774 |
\ttbreak |
|
775 |
by (resolve_tac [Hilbert.K] 1); |
|
776 |
{\out Level 5} |
|
777 |
{\out P --> P} |
|
778 |
{\out No subgoals!} |
|
779 |
\end{ttbox} |
|
780 |
As we can see, this Hilbert-style formulation of minimal logic is easy to |
|
781 |
define but difficult to use. The following natural deduction formulation is |
|
782 |
better: |
|
783 |
\begin{ttbox} |
|
784 |
MinI = Base + |
|
785 |
consts |
|
786 |
"-->" :: "[o, o] => o" (infixr 10) |
|
787 |
rules |
|
788 |
impI "(P ==> Q) ==> P --> Q" |
|
789 |
impE "[| P --> Q; P |] ==> Q" |
|
790 |
end |
|
791 |
\end{ttbox} |
|
792 |
Note, however, that although the two systems are equivalent, this fact |
|
793 |
cannot be proved within Isabelle. Axioms {\tt S} and {\tt K} can be |
|
794 |
derived in {\tt MinI} (exercise!), but {\tt impI} cannot be derived in {\tt |
|
795 |
Hilbert}. The reason is that {\tt impI} is only an {\bf admissible} rule |
|
796 |
in {\tt Hilbert}, something that can only be shown by induction over all |
|
797 |
possible proofs in {\tt Hilbert}. |
|
798 |
||
799 |
We may easily extend minimal logic with falsity: |
|
800 |
\begin{ttbox} |
|
801 |
MinIF = MinI + |
|
802 |
consts |
|
803 |
False :: "o" |
|
804 |
rules |
|
805 |
FalseE "False ==> P" |
|
806 |
end |
|
807 |
\end{ttbox} |
|
808 |
On the other hand, we may wish to introduce conjunction only: |
|
809 |
\begin{ttbox} |
|
810 |
MinC = Base + |
|
811 |
consts |
|
812 |
"&" :: "[o, o] => o" (infixr 30) |
|
813 |
\ttbreak |
|
814 |
rules |
|
815 |
conjI "[| P; Q |] ==> P & Q" |
|
816 |
conjE1 "P & Q ==> P" |
|
817 |
conjE2 "P & Q ==> Q" |
|
818 |
end |
|
819 |
\end{ttbox} |
|
820 |
And if we want to have all three connectives together, we create and load a |
|
821 |
theory file consisting of a single line:\footnote{We can combine the |
|
822 |
theories without creating a theory file using the ML declaration |
|
823 |
\begin{ttbox} |
|
824 |
val MinIFC_thy = merge_theories(MinIF,MinC) |
|
825 |
\end{ttbox} |
|
826 |
\index{*merge_theories|fnote}} |
|
827 |
\begin{ttbox} |
|
828 |
MinIFC = MinIF + MinC |
|
829 |
\end{ttbox} |
|
830 |
Now we can prove mixed theorems like |
|
831 |
\begin{ttbox} |
|
832 |
goal MinIFC.thy "P & False --> Q"; |
|
833 |
by (resolve_tac [MinI.impI] 1); |
|
834 |
by (dresolve_tac [MinC.conjE2] 1); |
|
835 |
by (eresolve_tac [MinIF.FalseE] 1); |
|
836 |
\end{ttbox} |
|
837 |
Try this as an exercise! |