author  wenzelm 
Tue, 20 May 1997 10:47:56 +0200  
changeset 3228  41ad2d5077be 
parent 3214  409382c0cc88 
child 3318  0cdbca0a2573 
permissions  rwrr 
320  1 
%% $Id$ 
2 
\chapter{Defining Logics} \label{DefiningLogics} 

3 
This chapter explains how to define new formal systems  in particular, 

4 
their concrete syntax. While Isabelle can be regarded as a theorem prover 

5 
for set theory, higherorder logic or the sequent calculus, its 

6 
distinguishing feature is support for the definition of new logics. 

7 

8 
Isabelle logics are hierarchies of theories, which are described and 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

9 
illustrated in 
320  10 
\iflabelundefined{sec:definingtheories}{{\em Introduction to Isabelle}}% 
11 
{\S\ref{sec:definingtheories}}. That material, together with the theory 

12 
files provided in the examples directories, should suffice for all simple 

13 
applications. The easiest way to define a new theory is by modifying a 

14 
copy of an existing theory. 

15 

16 
This chapter documents the metalogic syntax, mixfix declarations and 

17 
pretty printing. The extended examples in \S\ref{sec:min_logics} 

18 
demonstrate the logical aspects of the definition of theories. 

19 

20 

21 
\section{Priority grammars} \label{sec:priority_grammars} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

22 
\index{priority grammars(} 
320  23 

24 
A contextfree grammar contains a set of {\bf nonterminal symbols}, a set of 

25 
{\bf terminal symbols} and a set of {\bf productions}\index{productions}. 

26 
Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and 

27 
$\gamma$ is a string of terminals and nonterminals. One designated 

28 
nonterminal is called the {\bf start symbol}. The language defined by the 

29 
grammar consists of all strings of terminals that can be derived from the 

30 
start symbol by applying productions as rewrite rules. 

31 

32 
The syntax of an Isabelle logic is specified by a {\bf priority 

33 
grammar}.\index{priorities} Each nonterminal is decorated by an integer 

34 
priority, as in~$A^{(p)}$. A nonterminal $A^{(p)}$ in a derivation may be 

35 
rewritten using a production $A^{(q)} = \gamma$ only if~$p \le q$. Any 

36 
priority grammar can be translated into a normal context free grammar by 

37 
introducing new nonterminals and productions. 

38 

39 
Formally, a set of context free productions $G$ induces a derivation 

40 
relation $\longrightarrow@G$. Let $\alpha$ and $\beta$ denote strings of 

41 
terminal or nonterminal symbols. Then 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

42 
\[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \] 
320  43 
if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$. 
44 

45 
The following simple grammar for arithmetic expressions demonstrates how 

46 
binding power and associativity of operators can be enforced by priorities. 

47 
\begin{center} 

48 
\begin{tabular}{rclr} 

49 
$A^{(9)}$ & = & {\tt0} \\ 

50 
$A^{(9)}$ & = & {\tt(} $A^{(0)}$ {\tt)} \\ 

51 
$A^{(0)}$ & = & $A^{(0)}$ {\tt+} $A^{(1)}$ \\ 

52 
$A^{(2)}$ & = & $A^{(3)}$ {\tt*} $A^{(2)}$ \\ 

53 
$A^{(3)}$ & = & {\tt} $A^{(3)}$ 

54 
\end{tabular} 

55 
\end{center} 

56 
The choice of priorities determines that {\tt } binds tighter than {\tt *}, 

57 
which binds tighter than {\tt +}. Furthermore {\tt +} associates to the 

58 
left and {\tt *} to the right. 

59 

60 
For clarity, grammars obey these conventions: 

61 
\begin{itemize} 

62 
\item All priorities must lie between~0 and \ttindex{max_pri}, which is a 

63 
some fixed integer. Sometimes {\tt max_pri} is written as $\infty$. 

64 
\item Priority 0 on the righthand side and priority \ttindex{max_pri} on 

65 
the lefthand side may be omitted. 

66 
\item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the 

67 
priority of the lefthand side actually appears in a column on the far 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

68 
right. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

69 
\item Alternatives are separated by~$$. 
320  70 
\item Repetition is indicated by dots~(\dots) in an informal but obvious 
71 
way. 

72 
\end{itemize} 

73 

74 
Using these conventions and assuming $\infty=9$, the grammar 

75 
takes the form 

76 
\begin{center} 

77 
\begin{tabular}{rclc} 

78 
$A$ & = & {\tt0} & \hspace*{4em} \\ 

79 
& $$ & {\tt(} $A$ {\tt)} \\ 

80 
& $$ & $A$ {\tt+} $A^{(1)}$ & (0) \\ 

81 
& $$ & $A^{(3)}$ {\tt*} $A^{(2)}$ & (2) \\ 

82 
& $$ & {\tt} $A^{(3)}$ & (3) 

83 
\end{tabular} 

84 
\end{center} 

85 
\index{priority grammars)} 

86 

87 

88 
\begin{figure} 

89 
\begin{center} 

90 
\begin{tabular}{rclc} 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

91 
$any$ &=& $prop$ ~~$$~~ $logic$ \\\\ 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

92 
$prop$ &=& {\tt(} $prop$ {\tt)} \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

93 
&$$& $prop^{(4)}$ {\tt::} $type$ & (3) \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

94 
&$$& {\tt PROP} $aprop$ \\ 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

95 
&$$& $any^{(3)}$ {\tt ==} $any^{(2)}$ & (2) \\ 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

96 
&$$& $any^{(3)}$ {\tt =?=} $any^{(2)}$ & (2) \\ 
320  97 
&$$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\ 
98 
&$$& {\tt[} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt]} {\tt==>} $prop^{(1)}$ & (1) \\ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

99 
&$$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

100 
&$$& {\tt OFCLASS} {\tt(} $type$ {\tt,} $logic$ {\tt)} \\\\ 
320  101 
$aprop$ &=& $id$ ~~$$~~ $var$ 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

102 
~~$$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\\\ 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

103 
$logic$ &=& {\tt(} $logic$ {\tt)} \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

104 
&$$& $logic^{(4)}$ {\tt::} $type$ & (3) \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

105 
&$$& $id$ ~~$$~~ $var$ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

106 
~~$$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

107 
&$$& {\tt \%} $idts$ {\tt.} $any$ & (0) \\\\ 
320  108 
$idts$ &=& $idt$ ~~$$~~ $idt^{(1)}$ $idts$ \\\\ 
109 
$idt$ &=& $id$ ~~$$~~ {\tt(} $idt$ {\tt)} \\ 

110 
&$$& $id$ {\tt ::} $type$ & (0) \\\\ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

111 
$type$ &=& {\tt(} $type$ {\tt)} \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

112 
&$$& $tid$ ~~$$~~ $tvar$ ~~$$~~ $tid$ {\tt::} $sort$ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

113 
~~$$~~ $tvar$ {\tt::} $sort$ \\ 
320  114 
&$$& $id$ ~~$$~~ $type^{(\infty)}$ $id$ 
115 
~~$$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\ 

116 
&$$& $type^{(1)}$ {\tt =>} $type$ & (0) \\ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

117 
&$$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0) \\\\ 
320  118 
$sort$ &=& $id$ ~~$$~~ {\tt\ttlbrace\ttrbrace} 
119 
~~$$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace} 

120 
\end{tabular} 

121 
\index{*PROP symbol} 

122 
\index{*== symbol}\index{*=?= symbol}\index{*==> symbol} 

123 
\index{*:: symbol}\index{*=> symbol} 

332  124 
\index{sort constraints} 
125 
%the index command: a percent is permitted, but braces must match! 

320  126 
\index{%@{\tt\%} symbol} 
127 
\index{{}@{\tt\ttlbrace} symbol}\index{{}@{\tt\ttrbrace} symbol} 

128 
\index{*[ symbol}\index{*] symbol} 

129 
\index{*"!"! symbol} 

130 
\index{*"[" symbol} 

131 
\index{*""] symbol} 

132 
\end{center} 

133 
\caption{Metalogic syntax}\label{fig:pure_gram} 

134 
\end{figure} 

135 

136 

137 
\section{The Pure syntax} \label{sec:basic_syntax} 

138 
\index{syntax!Pure(} 

139 

140 
At the root of all objectlogics lies the theory \thydx{Pure}. It 

141 
contains, among many other things, the Pure syntax. An informal account of 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

142 
this basic syntax (types, terms and formulae) appears in 
320  143 
\iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}% 
144 
{\S\ref{sec:forward}}. A more precise description using a priority grammar 

145 
appears in Fig.\ts\ref{fig:pure_gram}. It defines the following 

146 
nonterminals: 

147 
\begin{ttdescription} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

148 
\item[\ndxbold{any}] denotes any term. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

149 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

150 
\item[\ndxbold{prop}] denotes terms of type {\tt prop}. These are formulae 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

151 
of the metalogic. Note that user constants of result type {\tt prop} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

152 
(i.e.\ $c :: \ldots \To prop$) should always provide concrete syntax. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

153 
Otherwise atomic propositions with head $c$ may be printed incorrectly. 
320  154 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

155 
\item[\ndxbold{aprop}] denotes atomic propositions. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

156 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

157 
%% FIXME huh!? 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

158 
% These typically 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

159 
% include the judgement forms of the objectlogic; its definition 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

160 
% introduces a metalevel predicate for each judgement form. 
320  161 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

162 
\item[\ndxbold{logic}] denotes terms whose type belongs to class 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

163 
\cldx{logic}, excluding type \tydx{prop}. 
320  164 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

165 
\item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

166 
by types. 
320  167 

168 
\item[\ndxbold{type}] denotes types of the metalogic. 

169 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

170 
\item[\ndxbold{sort}] denotes metalevel sorts. 
320  171 
\end{ttdescription} 
172 

173 
\begin{warn} 

174 
In {\tt idts}, note that \verbx::nat y is parsed as \verbx::(nat y), 

175 
treating {\tt y} like a type constructor applied to {\tt nat}. The 

176 
likely result is an error message. To avoid this interpretation, use 

177 
parentheses and write \verb(x::nat) y. 

332  178 
\index{type constraints}\index{*:: symbol} 
320  179 

180 
Similarly, \verbx::nat y::nat is parsed as \verbx::(nat y::nat) and 

181 
yields an error. The correct form is \verb(x::nat) (y::nat). 

182 
\end{warn} 

183 

452  184 
\begin{warn} 
185 
Type constraints bind very weakly. For example, \verb!x<y::nat! is normally 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

186 
parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 3 or less, in 
452  187 
which case the string is likely to be ambiguous. The correct form is 
188 
\verb!x<(y::nat)!. 

189 
\end{warn} 

320  190 

867  191 
\subsection{Logical types and default syntax}\label{logicaltypes} 
192 
\index{lambda calc@$\lambda$calculus} 

193 

194 
Isabelle's representation of mathematical languages is based on the 

195 
simply typed $\lambda$calculus. All logical types, namely those of 

196 
class \cldx{logic}, are automatically equipped with a basic syntax of 

197 
types, identifiers, variables, parentheses, $\lambda$abstraction and 

198 
application. 

199 
\begin{warn} 

200 
Isabelle combines the syntaxes for all types of class \cldx{logic} by 

201 
mapping all those types to the single nonterminal $logic$. Thus all 

202 
productions of $logic$, in particular $id$, $var$ etc, become available. 

203 
\end{warn} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

204 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

205 

320  206 
\subsection{Lexical matters} 
207 
The parser does not process input strings directly. It operates on token 

208 
lists provided by Isabelle's \bfindex{lexer}. There are two kinds of 

209 
tokens: \bfindex{delimiters} and \bfindex{name tokens}. 

210 

211 
\index{reserved words} 

212 
Delimiters can be regarded as reserved words of the syntax. You can 

213 
add new ones when extending theories. In Fig.\ts\ref{fig:pure_gram} they 

214 
appear in typewriter font, for example {\tt ==}, {\tt =?=} and 

215 
{\tt PROP}\@. 

216 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

217 
Name tokens have a predefined syntax. The lexer distinguishes six disjoint 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

218 
classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

219 
identifiers\index{type identifiers}, type unknowns\index{type unknowns}, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

220 
\rmindex{numerals}, \rmindex{strings}. They are denoted by \ndxbold{id}, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

221 
\ndxbold{var}, \ndxbold{tid}, \ndxbold{tvar}, \ndxbold{xnum}, \ndxbold{xstr}, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

222 
respectively. Typical examples are {\tt x}, {\tt ?x7}, {\tt 'a}, {\tt ?'a3}, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

223 
{\tt \#42}, {\tt ''foo bar''}. Here is the precise syntax: 
320  224 
\begin{eqnarray*} 
225 
id & = & letter~quasiletter^* \\ 

226 
var & = & \mbox{\tt ?}id ~~~~ \mbox{\tt ?}id\mbox{\tt .}nat \\ 

227 
tid & = & \mbox{\tt '}id \\ 

228 
tvar & = & \mbox{\tt ?}tid ~~~~ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

229 
\mbox{\tt ?}tid\mbox{\tt .}nat \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

230 
xnum & = & \mbox{\tt \#}nat ~~~~ \mbox{\tt \#\char`\~}nat \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

231 
xstr & = & \mbox{\tt ''\rm text\tt ''} \\[1ex] 
320  232 
letter & = & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\ 
233 
digit & = & \mbox{one of {\tt 0}\dots {\tt 9}} \\ 

234 
quasiletter & = & letter ~~~~ digit ~~~~ \mbox{\tt _} ~~~~ \mbox{\tt '} \\ 

235 
nat & = & digit^+ 

236 
\end{eqnarray*} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

237 
The lexer repeatedly takes the maximal prefix of the input string that forms 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

238 
a valid token. A maximal prefix that is both a delimiter and a name is 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

239 
treated as a delimiter. Spaces, tabs, newlines and formfeeds are separators; 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

240 
they never occur within tokens, except those of class $xstr$. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

241 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

242 
\medskip 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

243 
Delimiters need not be separated by white space. For example, if {\tt } 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

244 
is a delimiter but {\tt } is not, then the string {\tt } is treated as 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

245 
two consecutive occurrences of the token~{\tt }. In contrast, \ML\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

246 
treats {\tt } as a single symbolic name. The consequence of Isabelle's 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

247 
more liberal scheme is that the same string may be parsed in different ways 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

248 
after extending the syntax: after adding {\tt } as a delimiter, the input 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

249 
{\tt } is treated as a single token. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

250 

320  251 
A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally 
252 
a pair of base name and index (\ML\ type \mltydx{indexname}). These 

253 
components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or 

254 
run together as in {\tt ?x1}. The latter form is possible if the base name 

255 
does not end with digits. If the index is 0, it may be dropped altogether: 

256 
{\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}. 

257 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

258 
Tokens of class $xnum$ or $xstr$ are not used by the metalogic. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

259 
Objectlogics may provide numerals and string constants by adding appropriate 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

260 
productions and translation functions. 
320  261 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

262 
\medskip 
320  263 
Although name tokens are returned from the lexer rather than the parser, it 
264 
is more logical to regard them as nonterminals. Delimiters, however, are 

265 
terminals; they are just syntactic sugar and contribute nothing to the 

266 
abstract syntax tree. 

267 

268 

3108  269 
\subsection{*Inspecting the syntax} \label{pg:print_syn} 
320  270 
\begin{ttbox} 
271 
syn_of : theory > Syntax.syntax 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

272 
print_syntax : theory > unit 
320  273 
Syntax.print_syntax : Syntax.syntax > unit 
274 
Syntax.print_gram : Syntax.syntax > unit 

275 
Syntax.print_trans : Syntax.syntax > unit 

276 
\end{ttbox} 

277 
The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes 

278 
in \ML. You can display values of this type by calling the following 

279 
functions: 

280 
\begin{ttdescription} 

281 
\item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle 

282 
theory~{\it thy} as an \ML\ value. 

283 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

284 
\item[\ttindexbold{print_syntax} $thy$] displays the syntax part of $thy$ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

285 
using {\tt Syntax.print_syntax}. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

286 

320  287 
\item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all 
288 
information contained in the syntax {\it syn}. The displayed output can 

289 
be large. The following two functions are more selective. 

290 

291 
\item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

292 
of~{\it syn}, namely the lexicon, logical types and productions. These are 
320  293 
discussed below. 
294 

295 
\item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation 

296 
part of~{\it syn}, namely the constants, parse/print macros and 

297 
parse/print translations. 

298 
\end{ttdescription} 

299 

300 
Let us demonstrate these functions by inspecting Pure's syntax. Even that 

301 
is too verbose to display in full. 

302 
\begin{ttbox}\index{*Pure theory} 

303 
Syntax.print_syntax (syn_of Pure.thy); 

304 
{\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

305 
{\out logtypes: fun itself} 
320  306 
{\out prods:} 
307 
{\out type = tid (1000)} 

308 
{\out type = tvar (1000)} 

309 
{\out type = id (1000)} 

310 
{\out type = tid "::" sort[0] => "_ofsort" (1000)} 

311 
{\out type = tvar "::" sort[0] => "_ofsort" (1000)} 

312 
{\out \vdots} 

313 
\ttbreak 

3108  314 
{\out print modes: "symbols" "xterm"} 
320  315 
{\out consts: "_K" "_appl" "_aprop" "_args" "_asms" "_bigimpl" \dots} 
316 
{\out parse_ast_translation: "_appl" "_bigimpl" "_bracket"} 

317 
{\out "_idtyp" "_lambda" "_tapp" "_tappl"} 

318 
{\out parse_rules:} 

319 
{\out parse_translation: "!!" "_K" "_abs" "_aprop"} 

320 
{\out print_translation: "all"} 

321 
{\out print_rules:} 

322 
{\out print_ast_translation: "==>" "_abs" "_idts" "fun"} 

323 
\end{ttbox} 

324 

332  325 
As you can see, the output is divided into labelled sections. The grammar 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

326 
is represented by {\tt lexicon}, {\tt logtypes} and {\tt prods}. The rest 
320  327 
refers to syntactic translations and macro expansion. Here is an 
328 
explanation of the various sections. 

329 
\begin{description} 

330 
\item[{\tt lexicon}] lists the delimiters used for lexical 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

331 
analysis.\index{delimiters} 
320  332 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

333 
\item[{\tt logtypes}] lists the types that are regarded the same as {\tt 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

334 
logic} syntactically. Thus types of objectlogics (e.g.\ {\tt nat}, say) 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

335 
will be automatically equipped with the standard syntax of 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

336 
$\lambda$calculus. 
320  337 

338 
\item[{\tt prods}] lists the \rmindex{productions} of the priority grammar. 

339 
The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}. 

340 
Each delimiter is quoted. Some productions are shown with {\tt =>} and 

341 
an attached string. These strings later become the heads of parse 

342 
trees; they also play a vital role when terms are printed (see 

343 
\S\ref{sec:asts}). 

344 

345 
Productions with no strings attached are called {\bf copy 

346 
productions}\indexbold{productions!copy}. Their righthand side must 

347 
have exactly one nonterminal symbol (or name token). The parser does 

348 
not create a new parse tree node for copy productions, but simply 

349 
returns the parse tree of the righthand symbol. 

350 

351 
If the righthand side consists of a single nonterminal with no 

352 
delimiters, then the copy production is called a {\bf chain 

353 
production}. Chain productions act as abbreviations: 

354 
conceptually, they are removed from the grammar by adding new 

355 
productions. Priority information attached to chain productions is 

356 
ignored; only the dummy value $1$ is displayed. 

3108  357 

358 
\item[\ttindex{print_modes}] lists the alternative print modes 

359 
provided by this syntax (see \S\ref{sec:prmodes}). 

320  360 

361 
\item[{\tt consts}, {\tt parse_rules}, {\tt print_rules}] 

362 
relate to macros (see \S\ref{sec:macros}). 

363 

364 
\item[{\tt parse_ast_translation}, {\tt print_ast_translation}] 

365 
list sets of constants that invoke translation functions for abstract 

366 
syntax trees. Section \S\ref{sec:asts} below discusses this obscure 

367 
matter.\index{constants!for translations} 

368 

369 
\item[{\tt parse_translation}, {\tt print_translation}] list sets 

370 
of constants that invoke translation functions for terms (see 

371 
\S\ref{sec:tr_funs}). 

372 
\end{description} 

373 
\index{syntax!Pure)} 

374 

375 

376 
\section{Mixfix declarations} \label{sec:mixfix} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

377 
\index{mixfix declarations(} 
320  378 

379 
When defining a theory, you declare new constants by giving their names, 

380 
their type, and an optional {\bf mixfix annotation}. Mixfix annotations 

381 
allow you to extend Isabelle's basic $\lambda$calculus syntax with 

382 
readable notation. They can express any contextfree priority grammar. 

383 
Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

384 
general than the priority declarations of \ML\ and Prolog. 
320  385 

386 
A mixfix annotation defines a production of the priority grammar. It 

387 
describes the concrete syntax, the translation to abstract syntax, and the 

388 
pretty printing. Special case annotations provide a simple means of 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

389 
specifying infix operators and binders. 
320  390 

391 
\subsection{The general mixfix form} 

392 
Here is a detailed account of mixfix declarations. Suppose the following 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

393 
line occurs within a {\tt consts} or {\tt syntax} section of a {\tt .thy} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

394 
file: 
320  395 
\begin{center} 
396 
{\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)} 

397 
\end{center} 

332  398 
This constant declaration and mixfix annotation are interpreted as follows: 
320  399 
\begin{itemize}\index{productions} 
400 
\item The string {\tt $c$} is the name of the constant associated with the 

401 
production; unless it is a valid identifier, it must be enclosed in 

402 
quotes. If $c$ is empty (given as~{\tt ""}) then this is a copy 

403 
production.\index{productions!copy} Otherwise, parsing an instance of the 

404 
phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$ 

405 
$a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$th 

406 
argument. 

407 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

408 
\item The constant $c$, if nonempty, is declared to have type $\sigma$ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

409 
({\tt consts} section only). 
320  410 

411 
\item The string $template$ specifies the righthand side of 

412 
the production. It has the form 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

413 
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] 
320  414 
where each occurrence of {\tt_} denotes an argument position and 
415 
the~$w@i$ do not contain~{\tt _}. (If you want a literal~{\tt _} in 

416 
the concrete syntax, you must escape it as described below.) The $w@i$ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

417 
may consist of \rmindex{delimiters}, spaces or 
320  418 
\rmindex{pretty printing} annotations (see below). 
419 

420 
\item The type $\sigma$ specifies the production's nonterminal symbols 

421 
(or name tokens). If $template$ is of the form above then $\sigma$ 

422 
must be a function type with at least~$n$ argument positions, say 

423 
$\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are 

424 
derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

425 
below. Any of these may be function types. 
320  426 

427 
\item The optional list~$ps$ may contain at most $n$ integers, say {\tt 

428 
[$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal 

429 
priority\indexbold{priorities} required of any phrase that may appear 

430 
as the $i$th argument. Missing priorities default to~0. 

431 

432 
\item The integer $p$ is the priority of this production. If omitted, it 

433 
defaults to the maximal priority. 

434 
Priorities range between 0 and \ttindexbold{max_pri} (= 1000). 

435 
\end{itemize} 

436 
% 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

437 
The resulting production is \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

438 
A@2^{(p@2)}\, \dots\, A@n^{(p@n)}\, w@n \] where $A$ and the $A@i$ are the 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

439 
nonterminals corresponding to the types $\tau$ and $\tau@i$ respectively. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

440 
The nonterminal symbol associated with a type $(\ldots)ty$ is {\tt logic}, if 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

441 
this is a logical type (namely one of class {\tt logic} excluding {\tt 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

442 
prop}). Otherwise it is $ty$ (note that only the outermost type constructor 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

443 
is taken into account). Finally, the nonterminal of a type variable is {\tt 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

444 
any}. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

445 

911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

446 
\begin{warn} 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

447 
Theories must sometimes declare types for purely syntactic purposes  
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

448 
merely playing the role of nonterminals. One example is \tydx{type}, the 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

449 
builtin type of types. This is a `type of all types' in the syntactic 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

450 
sense only. Do not declare such types under {\tt arities} as belonging to 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

451 
class {\tt logic}\index{*logic class}, for that would make them useless as 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

452 
separate nonterminal symbols. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

453 
\end{warn} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

454 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

455 
Associating nonterminals with types allows a constant's type to specify 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

456 
syntax as well. We can declare the function~$f$ to have type $[\tau@1, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

457 
\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the layout 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

458 
of the function's $n$ arguments. The constant's name, in this case~$f$, will 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

459 
also serve as the label in the abstract syntax tree. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

460 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

461 
You may also declare mixfix syntax without adding constants to the theory's 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

462 
signature, by using a {\tt syntax} section instead of {\tt consts}. Thus a 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

463 
production need not map directly to a logical function (this typically 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

464 
requires additional syntactic translations, see also 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

465 
Chapter~\ref{chap:syntax}). 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

466 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

467 

911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

468 
\medskip 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

469 
As a special case of the general mixfix declaration, the form 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

470 
\begin{center} 
911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

471 
{\tt $c$ ::\ "$\sigma$" ("$template$")} 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

472 
\end{center} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

473 
specifies no priorities. The resulting production puts no priority 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

474 
constraints on any of its arguments and has maximal priority itself. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

475 
Omitting priorities in this manner is prone to syntactic ambiguities unless 
3098  476 
the production's righthand side is fully bracketed, as in 
477 
\verb"if _ then _ else _ fi". 

320  478 

479 
Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"}, 

480 
is sensible only if~$c$ is an identifier. Otherwise you will be unable to 

481 
write terms involving~$c$. 

482 

483 

484 
\subsection{Example: arithmetic expressions} 

485 
\index{examples!of mixfix declarations} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

486 
This theory specification contains a {\tt syntax} section with mixfix 
320  487 
declarations encoding the priority grammar from 
488 
\S\ref{sec:priority_grammars}: 

489 
\begin{ttbox} 

3108  490 
ExpSyntax = Pure + 
320  491 
types 
492 
exp 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

493 
syntax 
1387  494 
"0" :: exp ("0" 9) 
495 
"+" :: [exp, exp] => exp ("_ + _" [0, 1] 0) 

496 
"*" :: [exp, exp] => exp ("_ * _" [3, 2] 2) 

497 
"" :: exp => exp (" _" [3] 3) 

320  498 
end 
499 
\end{ttbox} 

3108  500 
If you put this into a file {\tt ExpSyntax.thy} and load it via {\tt 
501 
use_thy"ExpSyntax"}, you can run some tests: 

320  502 
\begin{ttbox} 
3108  503 
val read_exp = Syntax.test_read (syn_of ExpSyntax.thy) "exp"; 
320  504 
{\out val it = fn : string > unit} 
505 
read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0"; 

506 
{\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"} 

507 
{\out raw: ("+" ("+" ("+" ("*" "0" ("*" "0" ("*" "0" "0"))) "0") "0") "0")} 

508 
{\out \vdots} 

509 
read_exp "0 +  0 + 0"; 

510 
{\out tokens: "0" "+" "" "0" "+" "0"} 

511 
{\out raw: ("+" ("+" "0" ("" "0")) "0")} 

512 
{\out \vdots} 

513 
\end{ttbox} 

514 
The output of \ttindex{Syntax.test_read} includes the token list ({\tt 

515 
tokens}) and the raw \AST{} directly derived from the parse tree, 

516 
ignoring parse \AST{} translations. The rest is tracing information 

517 
provided by the macro expander (see \S\ref{sec:macros}). 

518 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

519 
Executing {\tt Syntax.print_gram} reveals the productions derived from the 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

520 
above mixfix declarations (lots of additional information deleted): 
320  521 
\begin{ttbox} 
3108  522 
Syntax.print_gram (syn_of ExpSyntax.thy); 
320  523 
{\out exp = "0" => "0" (9)} 
524 
{\out exp = exp[0] "+" exp[1] => "+" (0)} 

525 
{\out exp = exp[3] "*" exp[2] => "*" (2)} 

526 
{\out exp = "" exp[3] => "" (3)} 

527 
\end{ttbox} 

528 

3108  529 
Note that because {\tt exp} is not of class {\tt logic}, it has been 
530 
retained as a separate nonterminal. This also entails that the syntax 

531 
does not provide for identifiers or paranthesized expressions. 

532 
Normally you would also want to add the declaration {\tt arities 

533 
exp::logic} after {\tt types} and use {\tt consts} instead of {\tt 

534 
syntax}. Try this as an exercise and study the changes in the 

867  535 
grammar. 
320  536 

537 
\subsection{The mixfix template} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

538 
Let us now take a closer look at the string $template$ appearing in mixfix 
320  539 
annotations. This string specifies a list of parsing and printing 
540 
directives: delimiters\index{delimiters}, arguments, spaces, blocks of 

541 
indentation and line breaks. These are encoded by the following character 

542 
sequences: 

543 
\index{pretty printing(} 

544 
\begin{description} 

545 
\item[~$d$~] is a delimiter, namely a nonempty sequence of characters 

546 
other than the special characters {\tt _}, {\tt(}, {\tt)} and~{\tt/}. 

547 
Even these characters may appear if escaped; this means preceding it with 

548 
a~{\tt '} (single quote). Thus you have to write {\tt ''} if you really 

911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

549 
want a single quote. Furthermore, a~{\tt '} followed by a space separates 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

550 
delimiters without extra white space being added for printing. 
320  551 

552 
\item[~{\tt_}~] is an argument position, which stands for a nonterminal symbol 

553 
or name token. 

554 

555 
\item[~$s$~] is a nonempty sequence of spaces for printing. This and the 

556 
following specifications do not affect parsing at all. 

557 

558 
\item[~{\tt(}$n$~] opens a pretty printing block. The optional number $n$ 

559 
specifies how much indentation to add when a line break occurs within the 

560 
block. If {\tt(} is not followed by digits, the indentation defaults 

561 
to~0. 

562 

563 
\item[~{\tt)}~] closes a pretty printing block. 

564 

565 
\item[~{\tt//}~] forces a line break. 

566 

567 
\item[~{\tt/}$s$~] allows a line break. Here $s$ stands for the string of 

568 
spaces (zero or more) right after the {\tt /} character. These spaces 

569 
are printed if the break is not taken. 

570 
\end{description} 

571 
For example, the template {\tt"(_ +/ _)"} specifies an infix operator. 

572 
There are two argument positions; the delimiter~{\tt+} is preceded by a 

573 
space and followed by a space or line break; the entire phrase is a pretty 

574 
printing block. Other examples appear in Fig.\ts\ref{fig:set_trans} below. 

575 
Isabelle's pretty printer resembles the one described in 

576 
Paulson~\cite{paulson91}. 

577 

578 
\index{pretty printing)} 

579 

580 

581 
\subsection{Infixes} 

582 
\indexbold{infixes} 

583 

3108  584 
Infix operators associating to the left or right can be declared using 
585 
{\tt infixl} or {\tt infixr}. Basically, the form {\tt $c$ ::\ 

586 
$\sigma$ (infixl $p$)} abbreviates the mixfix declarations 

320  587 
\begin{ttbox} 
1387  588 
"op \(c\)" :: \(\sigma\) ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\)) 
589 
"op \(c\)" :: \(\sigma\) ("op \(c\)") 

320  590 
\end{ttbox} 
1387  591 
and {\tt $c$ ::\ $\sigma$ (infixr $p$)} abbreviates the mixfix declarations 
320  592 
\begin{ttbox} 
1387  593 
"op \(c\)" :: \(\sigma\) ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\)) 
594 
"op \(c\)" :: \(\sigma\) ("op \(c\)") 

320  595 
\end{ttbox} 
596 
The infix operator is declared as a constant with the prefix {\tt op}. 

597 
Thus, prefixing infixes with \sdx{op} makes them behave like ordinary 

598 
function symbols, as in \ML. Special characters occurring in~$c$ must be 

599 
escaped, as in delimiters, using a single quote. 

600 

3108  601 
A slightly more general form of infix declarations allows constant 
602 
names to be independent from their concrete syntax, namely \texttt{$c$ 

603 
::\ $\sigma$\ (infixl "$sy$" $p$)}, the same for \texttt{infixr}. As 

604 
an example consider: 

605 
\begin{ttbox} 

606 
and :: [bool, bool] => bool (infixr "&" 35) 

607 
\end{ttbox} 

608 
The internal constant name will then be just \texttt{and}, without any 

609 
\texttt{op} prefixed. 

610 

320  611 

612 
\subsection{Binders} 

613 
\indexbold{binders} 

614 
\begingroup 

615 
\def\Q{{\cal Q}} 

616 
A {\bf binder} is a variablebinding construct such as a quantifier. The 

617 
constant declaration 

618 
\begin{ttbox} 

1387  619 
\(c\) :: \(\sigma\) (binder "\(\Q\)" [\(pb\)] \(p\)) 
320  620 
\end{ttbox} 
621 
introduces a constant~$c$ of type~$\sigma$, which must have the form 

622 
$(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where 

623 
$x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$ 

877  624 
and the whole term has type~$\tau@3$. The optional integer $pb$ 
1060
a122584b5cc5
In binders, the default body priority is now p instead of 0.
lcp
parents:
911
diff
changeset

625 
specifies the body's priority, by default~$p$. Special characters 
877  626 
in $\Q$ must be escaped using a single quote. 
320  627 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

628 
The declaration is expanded internally to something like 
320  629 
\begin{ttbox} 
3098  630 
\(c\)\hskip3pt :: (\(\tau@1\) => \(\tau@2\)) => \(\tau@3\) 
631 
"\(\Q\)" :: [idts, \(\tau@2\)] => \(\tau@3\) ("(3\(\Q\)_./ _)" [0, \(pb\)] \(p\)) 

320  632 
\end{ttbox} 
633 
Here \ndx{idts} is the nonterminal symbol for a list of identifiers with 

332  634 
\index{type constraints} 
320  635 
optional type constraints (see Fig.\ts\ref{fig:pure_gram}). The 
636 
declaration also installs a parse translation\index{translations!parse} 

637 
for~$\Q$ and a print translation\index{translations!print} for~$c$ to 

638 
translate between the internal and external forms. 

639 

640 
A binder of type $(\sigma \To \tau) \To \tau$ can be nested by giving a 

641 
list of variables. The external form $\Q~x@1~x@2 \ldots x@n. P$ 

642 
corresponds to the internal form 

643 
\[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)). \] 

644 

645 
\medskip 

646 
For example, let us declare the quantifier~$\forall$:\index{quantifiers} 

647 
\begin{ttbox} 

1387  648 
All :: ('a => o) => o (binder "ALL " 10) 
320  649 
\end{ttbox} 
650 
This lets us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL 

651 
$x$.$P$}. When printing, Isabelle prefers the latter form, but must fall 

652 
back on ${\tt All}(P)$ if $P$ is not an abstraction. Both $P$ and {\tt ALL 

653 
$x$.$P$} have type~$o$, the type of formulae, while the bound variable 

654 
can be polymorphic. 

655 
\endgroup 

656 

657 
\index{mixfix declarations)} 

658 

3108  659 

660 
\section{*Alternative print modes} \label{sec:prmodes} 

661 
\index{print modes(} 

662 
% 

663 
Isabelle's pretty printer supports alternative output syntaxes. These 

664 
may be used independently or in cooperation. The currently active 

665 
print modes (with precedence from left to right) are determined by a 

666 
reference variable. 

667 
\begin{ttbox}\index{*print_mode} 

668 
print_mode: string list ref 

669 
\end{ttbox} 

670 
Initially this may already contain some print mode identifiers, 

671 
depending on how Isabelle has been invoked (e.g.\ by some user 

672 
interface). So changes should be incremental  adding or deleting 

673 
modes relative to the current value. 

674 

675 
Any \ML{} string is a legal print mode identifier, without any 

676 
predeclaration required. The following names should be considered 

677 
reserved, though: \texttt{""} (yes, the empty string), 

678 
\texttt{symbols}, \texttt{latex}, \texttt{xterm}. 

679 

680 
There is a separate table of mixfix productions for pretty printing 

681 
associated with each print mode. The currently active ones are 

682 
conceptually just concatenated from left to right, with the standard 

683 
syntax output table always coming last as default. Thus mixfix 

684 
productions of preceding modes in the list may override those of later 

685 
ones. Also note that token translations are always relative to some 

686 
print mode (see \S\ref{sec:tok_tr}). 

687 

688 
\medskip The canonical application of print modes is optional printing 

689 
of mathematical symbols from a special screen font instead of {\sc 

690 
ascii}. Another example is to reuse Isabelle's advanced 

691 
$\lambda$term printing mechanisms to generate completely different 

3228  692 
output, say for interfacing external tools like \rmindex{model 
693 
checkers} (see also \texttt{HOL/Modelcheck}). 

3108  694 

695 
\index{print modes)} 

696 

697 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

698 
\section{Ambiguity of parsed expressions} \label{sec:ambiguity} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

699 
\index{ambiguity!of parsed expressions} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

700 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

701 
To keep the grammar small and allow common productions to be shared 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

702 
all logical types (except {\tt prop}) are internally represented 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

703 
by one nonterminal, namely {\tt logic}. This and omitted or too freely 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

704 
chosen priorities may lead to ways of parsing an expression that were 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

705 
not intended by the theory's maker. In most cases Isabelle is able to 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

706 
select one of multiple parse trees that an expression has lead 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

707 
to by checking which of them can be typed correctly. But this may not 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

708 
work in every case and always slows down parsing. 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

709 
The warning and error messages that can be produced during this process are 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

710 
as follows: 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

711 

880  712 
If an ambiguity can be resolved by type inference the following 
713 
warning is shown to remind the user that parsing is (unnecessarily) 

714 
slowed down. In cases where it's not easily possible to eliminate the 

715 
ambiguity the frequency of the warning can be controlled by changing 

883
92abd2ad9d08
renamed Sign.ambiguity_level to Syntax.ambiguity_level
clasohm
parents:
880
diff
changeset

716 
the value of {\tt Syntax.ambiguity_level} which has type {\tt int 
880  717 
ref}. Its default value is 1 and by increasing it one can control how 
883
92abd2ad9d08
renamed Sign.ambiguity_level to Syntax.ambiguity_level
clasohm
parents:
880
diff
changeset

718 
many parse trees are necessary to generate the warning. 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

719 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

720 
\begin{ttbox} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

721 
{\out Warning: Ambiguous input "..."} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

722 
{\out produces the following parse trees:} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

723 
{\out ...} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

724 
{\out Fortunately, only one parse tree is type correct.} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

725 
{\out It helps (speed!) if you disambiguate your grammar or your input.} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

726 
\end{ttbox} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

727 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

728 
The following message is normally caused by using the same 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

729 
syntax in two different productions: 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

730 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

731 
\begin{ttbox} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

732 
{\out Warning: Ambiguous input "..."} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

733 
{\out produces the following parse trees:} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

734 
{\out ...} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

735 
{\out Error: More than one term is type correct:} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

736 
{\out ...} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

737 
\end{ttbox} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

738 

866
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
clasohm
parents:
864
diff
changeset

739 
Ambiguities occuring in syntax translation rules cannot be resolved by 
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
clasohm
parents:
864
diff
changeset

740 
type inference because it is not necessary for these rules to be type 
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
clasohm
parents:
864
diff
changeset

741 
correct. Therefore Isabelle always generates an error message and the 
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
clasohm
parents:
864
diff
changeset

742 
ambiguity should be eliminated by changing the grammar or the rule. 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

743 

320  744 

745 
\section{Example: some minimal logics} \label{sec:min_logics} 

746 
\index{examples!of logic definitions} 

747 

748 
This section presents some examples that have a simple syntax. They 

749 
demonstrate how to define new objectlogics from scratch. 

750 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

751 
First we must define how an objectlogic syntax is embedded into the 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

752 
metalogic. Since all theorems must conform to the syntax for~\ndx{prop} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

753 
(see Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the 
320  754 
objectlevel syntax. Assume that the syntax of your objectlogic defines a 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

755 
metatype~\tydx{o} of formulae which refers to the nonterminal {\tt logic}. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

756 
These formulae can now appear in axioms and theorems wherever \ndx{prop} does 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

757 
if you add the production 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

758 
\[ prop ~=~ logic. \] 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

759 
This is not supposed to be a copy production but an implicit coercion from 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

760 
formulae to propositions: 
320  761 
\begin{ttbox} 
762 
Base = Pure + 

763 
types 

764 
o 

765 
arities 

766 
o :: logic 

767 
consts 

1387  768 
Trueprop :: o => prop ("_" 5) 
320  769 
end 
770 
\end{ttbox} 

771 
The constant \cdx{Trueprop} (the name is arbitrary) acts as an invisible 

332  772 
coercion function. Assuming this definition resides in a file {\tt Base.thy}, 
320  773 
you have to load it with the command {\tt use_thy "Base"}. 
774 

775 
One of the simplest nontrivial logics is {\bf minimal logic} of 

776 
implication. Its definition in Isabelle needs no advanced features but 

777 
illustrates the overall mechanism nicely: 

778 
\begin{ttbox} 

779 
Hilbert = Base + 

780 
consts 

1387  781 
">" :: [o, o] => o (infixr 10) 
320  782 
rules 
783 
K "P > Q > P" 

784 
S "(P > Q > R) > (P > Q) > P > R" 

785 
MP "[ P > Q; P ] ==> Q" 

786 
end 

787 
\end{ttbox} 

332  788 
After loading this definition from the file {\tt Hilbert.thy}, you can 
320  789 
start to prove theorems in the logic: 
790 
\begin{ttbox} 

791 
goal Hilbert.thy "P > P"; 

792 
{\out Level 0} 

793 
{\out P > P} 

794 
{\out 1. P > P} 

795 
\ttbreak 

796 
by (resolve_tac [Hilbert.MP] 1); 

797 
{\out Level 1} 

798 
{\out P > P} 

799 
{\out 1. ?P > P > P} 

800 
{\out 2. ?P} 

801 
\ttbreak 

802 
by (resolve_tac [Hilbert.MP] 1); 

803 
{\out Level 2} 

804 
{\out P > P} 

805 
{\out 1. ?P1 > ?P > P > P} 

806 
{\out 2. ?P1} 

807 
{\out 3. ?P} 

808 
\ttbreak 

809 
by (resolve_tac [Hilbert.S] 1); 

810 
{\out Level 3} 

811 
{\out P > P} 

812 
{\out 1. P > ?Q2 > P} 

813 
{\out 2. P > ?Q2} 

814 
\ttbreak 

815 
by (resolve_tac [Hilbert.K] 1); 

816 
{\out Level 4} 

817 
{\out P > P} 

818 
{\out 1. P > ?Q2} 

819 
\ttbreak 

820 
by (resolve_tac [Hilbert.K] 1); 

821 
{\out Level 5} 

822 
{\out P > P} 

823 
{\out No subgoals!} 

824 
\end{ttbox} 

825 
As we can see, this Hilbertstyle formulation of minimal logic is easy to 

826 
define but difficult to use. The following natural deduction formulation is 

827 
better: 

828 
\begin{ttbox} 

829 
MinI = Base + 

830 
consts 

1387  831 
">" :: [o, o] => o (infixr 10) 
320  832 
rules 
833 
impI "(P ==> Q) ==> P > Q" 

834 
impE "[ P > Q; P ] ==> Q" 

835 
end 

836 
\end{ttbox} 

837 
Note, however, that although the two systems are equivalent, this fact 

838 
cannot be proved within Isabelle. Axioms {\tt S} and {\tt K} can be 

839 
derived in {\tt MinI} (exercise!), but {\tt impI} cannot be derived in {\tt 

840 
Hilbert}. The reason is that {\tt impI} is only an {\bf admissible} rule 

841 
in {\tt Hilbert}, something that can only be shown by induction over all 

842 
possible proofs in {\tt Hilbert}. 

843 

844 
We may easily extend minimal logic with falsity: 

845 
\begin{ttbox} 

846 
MinIF = MinI + 

847 
consts 

1387  848 
False :: o 
320  849 
rules 
850 
FalseE "False ==> P" 

851 
end 

852 
\end{ttbox} 

853 
On the other hand, we may wish to introduce conjunction only: 

854 
\begin{ttbox} 

855 
MinC = Base + 

856 
consts 

1387  857 
"&" :: [o, o] => o (infixr 30) 
320  858 
\ttbreak 
859 
rules 

860 
conjI "[ P; Q ] ==> P & Q" 

861 
conjE1 "P & Q ==> P" 

862 
conjE2 "P & Q ==> Q" 

863 
end 

864 
\end{ttbox} 

865 
And if we want to have all three connectives together, we create and load a 

3108  866 
theory file consisting of a single line: 
320  867 
\begin{ttbox} 
868 
MinIFC = MinIF + MinC 

869 
\end{ttbox} 

870 
Now we can prove mixed theorems like 

871 
\begin{ttbox} 

872 
goal MinIFC.thy "P & False > Q"; 

873 
by (resolve_tac [MinI.impI] 1); 

874 
by (dresolve_tac [MinC.conjE2] 1); 

875 
by (eresolve_tac [MinIF.FalseE] 1); 

876 
\end{ttbox} 

877 
Try this as an exercise! 