author  obua 
Fri, 16 Sep 2005 21:02:15 +0200  
changeset 17440  df77edc4f5d0 
parent 14960  89cce4e95a22 
child 20093  164a42b7385f 
permissions  rwrr 
320  1 
%% $Id$ 
2 
\chapter{Defining Logics} \label{DefiningLogics} 

3 
This chapter explains how to define new formal systems  in particular, 

4 
their concrete syntax. While Isabelle can be regarded as a theorem prover 

5 
for set theory, higherorder logic or the sequent calculus, its 

6 
distinguishing feature is support for the definition of new logics. 

7 

8 
Isabelle logics are hierarchies of theories, which are described and 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

9 
illustrated in 
320  10 
\iflabelundefined{sec:definingtheories}{{\em Introduction to Isabelle}}% 
11 
{\S\ref{sec:definingtheories}}. That material, together with the theory 

12 
files provided in the examples directories, should suffice for all simple 

13 
applications. The easiest way to define a new theory is by modifying a 

14 
copy of an existing theory. 

15 

16 
This chapter documents the metalogic syntax, mixfix declarations and 

17 
pretty printing. The extended examples in \S\ref{sec:min_logics} 

18 
demonstrate the logical aspects of the definition of theories. 

19 

20 

21 
\section{Priority grammars} \label{sec:priority_grammars} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

22 
\index{priority grammars(} 
320  23 

24 
A contextfree grammar contains a set of {\bf nonterminal symbols}, a set of 

25 
{\bf terminal symbols} and a set of {\bf productions}\index{productions}. 

26 
Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and 

27 
$\gamma$ is a string of terminals and nonterminals. One designated 

28 
nonterminal is called the {\bf start symbol}. The language defined by the 

29 
grammar consists of all strings of terminals that can be derived from the 

30 
start symbol by applying productions as rewrite rules. 

31 

32 
The syntax of an Isabelle logic is specified by a {\bf priority 

33 
grammar}.\index{priorities} Each nonterminal is decorated by an integer 

34 
priority, as in~$A^{(p)}$. A nonterminal $A^{(p)}$ in a derivation may be 

14231  35 
rewritten using a production $A^{(q)} = \gamma$ only if~$p \leq q$. Any 
320  36 
priority grammar can be translated into a normal context free grammar by 
37 
introducing new nonterminals and productions. 

38 

39 
Formally, a set of context free productions $G$ induces a derivation 

40 
relation $\longrightarrow@G$. Let $\alpha$ and $\beta$ denote strings of 

41 
terminal or nonterminal symbols. Then 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

42 
\[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \] 
14231  43 
if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \leq q$. 
320  44 

45 
The following simple grammar for arithmetic expressions demonstrates how 

46 
binding power and associativity of operators can be enforced by priorities. 

47 
\begin{center} 

48 
\begin{tabular}{rclr} 

49 
$A^{(9)}$ & = & {\tt0} \\ 

50 
$A^{(9)}$ & = & {\tt(} $A^{(0)}$ {\tt)} \\ 

51 
$A^{(0)}$ & = & $A^{(0)}$ {\tt+} $A^{(1)}$ \\ 

52 
$A^{(2)}$ & = & $A^{(3)}$ {\tt*} $A^{(2)}$ \\ 

53 
$A^{(3)}$ & = & {\tt} $A^{(3)}$ 

54 
\end{tabular} 

55 
\end{center} 

56 
The choice of priorities determines that {\tt } binds tighter than {\tt *}, 

57 
which binds tighter than {\tt +}. Furthermore {\tt +} associates to the 

58 
left and {\tt *} to the right. 

59 

60 
For clarity, grammars obey these conventions: 

61 
\begin{itemize} 

62 
\item All priorities must lie between~0 and \ttindex{max_pri}, which is a 

63 
some fixed integer. Sometimes {\tt max_pri} is written as $\infty$. 

64 
\item Priority 0 on the righthand side and priority \ttindex{max_pri} on 

65 
the lefthand side may be omitted. 

66 
\item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the 

67 
priority of the lefthand side actually appears in a column on the far 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

68 
right. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

69 
\item Alternatives are separated by~$$. 
320  70 
\item Repetition is indicated by dots~(\dots) in an informal but obvious 
71 
way. 

72 
\end{itemize} 

73 

74 
Using these conventions and assuming $\infty=9$, the grammar 

75 
takes the form 

76 
\begin{center} 

77 
\begin{tabular}{rclc} 

78 
$A$ & = & {\tt0} & \hspace*{4em} \\ 

79 
& $$ & {\tt(} $A$ {\tt)} \\ 

80 
& $$ & $A$ {\tt+} $A^{(1)}$ & (0) \\ 

81 
& $$ & $A^{(3)}$ {\tt*} $A^{(2)}$ & (2) \\ 

82 
& $$ & {\tt} $A^{(3)}$ & (3) 

83 
\end{tabular} 

84 
\end{center} 

85 
\index{priority grammars)} 

86 

87 

4597
a0bdee64194c
Fixed a lot of overfull and underfull lines (hboxes)
paulson
parents:
4543
diff
changeset

88 
\begin{figure}\small 
320  89 
\begin{center} 
90 
\begin{tabular}{rclc} 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

91 
$any$ &=& $prop$ ~~$$~~ $logic$ \\\\ 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

92 
$prop$ &=& {\tt(} $prop$ {\tt)} \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

93 
&$$& $prop^{(4)}$ {\tt::} $type$ & (3) \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

94 
&$$& {\tt PROP} $aprop$ \\ 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

95 
&$$& $any^{(3)}$ {\tt ==} $any^{(2)}$ & (2) \\ 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

96 
&$$& $any^{(3)}$ {\tt =?=} $any^{(2)}$ & (2) \\ 
320  97 
&$$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\ 
98 
&$$& {\tt[} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt]} {\tt==>} $prop^{(1)}$ & (1) \\ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

99 
&$$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

100 
&$$& {\tt OFCLASS} {\tt(} $type$ {\tt,} $logic$ {\tt)} \\\\ 
4543  101 
$aprop$ &=& $id$ ~~$$~~ $longid$ ~~$$~~ $var$ 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

102 
~~$$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\\\ 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

103 
$logic$ &=& {\tt(} $logic$ {\tt)} \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

104 
&$$& $logic^{(4)}$ {\tt::} $type$ & (3) \\ 
4543  105 
&$$& $id$ ~~$$~~ $longid$ ~~$$~~ $var$ 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

106 
~~$$~~ $logic^{(\infty)}$ {\tt(} $any$ {\tt,} \dots {\tt,} $any$ {\tt)} \\ 
11621  107 
&$$& {\tt \%} $pttrns$ {\tt.} $any^{(3)}$ & (3) \\ 
108 
&$$& {\tt TYPE} {\tt(} $type$ {\tt)} \\\\ 

320  109 
$idts$ &=& $idt$ ~~$$~~ $idt^{(1)}$ $idts$ \\\\ 
110 
$idt$ &=& $id$ ~~$$~~ {\tt(} $idt$ {\tt)} \\ 

111 
&$$& $id$ {\tt ::} $type$ & (0) \\\\ 

3694  112 
$pttrns$ &=& $pttrn$ ~~$$~~ $pttrn^{(1)}$ $pttrns$ \\\\ 
113 
$pttrn$ &=& $idt$ \\\\ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

114 
$type$ &=& {\tt(} $type$ {\tt)} \\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

115 
&$$& $tid$ ~~$$~~ $tvar$ ~~$$~~ $tid$ {\tt::} $sort$ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

116 
~~$$~~ $tvar$ {\tt::} $sort$ \\ 
320  117 
&$$& $id$ ~~$$~~ $type^{(\infty)}$ $id$ 
118 
~~$$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\ 

4543  119 
&$$& $longid$ ~~$$~~ $type^{(\infty)}$ $longid$ 
120 
~~$$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $longid$ \\ 

320  121 
&$$& $type^{(1)}$ {\tt =>} $type$ & (0) \\ 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

122 
&$$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0) \\\\ 
4543  123 
$sort$ &=& $id$ ~~$$~~ $longid$ ~~$$~~ {\tt\ttlbrace\ttrbrace} ~~$$~~ 
8136  124 
{\tt\ttlbrace} $id$ ~$$~ $longid${\tt,}\dots{\tt,} $id$ ~$$~$longid$ {\tt\ttrbrace} 
320  125 
\end{tabular} 
126 
\index{*PROP symbol} 

127 
\index{*== symbol}\index{*=?= symbol}\index{*==> symbol} 

128 
\index{*:: symbol}\index{*=> symbol} 

332  129 
\index{sort constraints} 
130 
%the index command: a percent is permitted, but braces must match! 

320  131 
\index{%@{\tt\%} symbol} 
132 
\index{{}@{\tt\ttlbrace} symbol}\index{{}@{\tt\ttrbrace} symbol} 

133 
\index{*[ symbol}\index{*] symbol} 

134 
\index{*"!"! symbol} 

135 
\index{*"[" symbol} 

136 
\index{*""] symbol} 

137 
\end{center} 

138 
\caption{Metalogic syntax}\label{fig:pure_gram} 

139 
\end{figure} 

140 

141 

142 
\section{The Pure syntax} \label{sec:basic_syntax} 

143 
\index{syntax!Pure(} 

144 

145 
At the root of all objectlogics lies the theory \thydx{Pure}. It 

146 
contains, among many other things, the Pure syntax. An informal account of 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

147 
this basic syntax (types, terms and formulae) appears in 
320  148 
\iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}% 
149 
{\S\ref{sec:forward}}. A more precise description using a priority grammar 

150 
appears in Fig.\ts\ref{fig:pure_gram}. It defines the following 

151 
nonterminals: 

152 
\begin{ttdescription} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

153 
\item[\ndxbold{any}] denotes any term. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

154 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

155 
\item[\ndxbold{prop}] denotes terms of type {\tt prop}. These are formulae 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

156 
of the metalogic. Note that user constants of result type {\tt prop} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

157 
(i.e.\ $c :: \ldots \To prop$) should always provide concrete syntax. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

158 
Otherwise atomic propositions with head $c$ may be printed incorrectly. 
320  159 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

160 
\item[\ndxbold{aprop}] denotes atomic propositions. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

161 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

162 
%% FIXME huh!? 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

163 
% These typically 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

164 
% include the judgement forms of the objectlogic; its definition 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

165 
% introduces a metalevel predicate for each judgement form. 
320  166 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

167 
\item[\ndxbold{logic}] denotes terms whose type belongs to class 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

168 
\cldx{logic}, excluding type \tydx{prop}. 
320  169 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

170 
\item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

171 
by types. 
3694  172 

173 
\item[\ndxbold{pttrn}, \ndxbold{pttrns}] denote patterns for 

174 
abstraction, cases etc. Initially the same as $idt$ and $idts$, 

14231  175 
these are intended to be augmented by user extensions. 
320  176 

177 
\item[\ndxbold{type}] denotes types of the metalogic. 

178 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

179 
\item[\ndxbold{sort}] denotes metalevel sorts. 
320  180 
\end{ttdescription} 
181 

182 
\begin{warn} 

183 
In {\tt idts}, note that \verbx::nat y is parsed as \verbx::(nat y), 

184 
treating {\tt y} like a type constructor applied to {\tt nat}. The 

185 
likely result is an error message. To avoid this interpretation, use 

186 
parentheses and write \verb(x::nat) y. 

332  187 
\index{type constraints}\index{*:: symbol} 
320  188 

189 
Similarly, \verbx::nat y::nat is parsed as \verbx::(nat y::nat) and 

190 
yields an error. The correct form is \verb(x::nat) (y::nat). 

191 
\end{warn} 

192 

452  193 
\begin{warn} 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

194 
Type constraints bind very weakly. For example, \verb!x<y::nat! is normally 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

195 
parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 3 or less, in 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

196 
which case the string is likely to be ambiguous. The correct form is 
452  197 
\verb!x<(y::nat)!. 
198 
\end{warn} 

320  199 

867  200 
\subsection{Logical types and default syntax}\label{logicaltypes} 
201 
\index{lambda calc@$\lambda$calculus} 

202 

203 
Isabelle's representation of mathematical languages is based on the 

204 
simply typed $\lambda$calculus. All logical types, namely those of 

205 
class \cldx{logic}, are automatically equipped with a basic syntax of 

206 
types, identifiers, variables, parentheses, $\lambda$abstraction and 

207 
application. 

208 
\begin{warn} 

209 
Isabelle combines the syntaxes for all types of class \cldx{logic} by 

210 
mapping all those types to the single nonterminal $logic$. Thus all 

211 
productions of $logic$, in particular $id$, $var$ etc, become available. 

212 
\end{warn} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

213 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

214 

320  215 
\subsection{Lexical matters} 
216 
The parser does not process input strings directly. It operates on token 

217 
lists provided by Isabelle's \bfindex{lexer}. There are two kinds of 

218 
tokens: \bfindex{delimiters} and \bfindex{name tokens}. 

219 

220 
\index{reserved words} 

221 
Delimiters can be regarded as reserved words of the syntax. You can 

222 
add new ones when extending theories. In Fig.\ts\ref{fig:pure_gram} they 

223 
appear in typewriter font, for example {\tt ==}, {\tt =?=} and 

224 
{\tt PROP}\@. 

225 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

226 
Name tokens have a predefined syntax. The lexer distinguishes six disjoint 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

227 
classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

228 
identifiers\index{type identifiers}, type unknowns\index{type unknowns}, 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

229 
\rmindex{numerals}, \rmindex{strings}. They are denoted by \ndxbold{id}, 
14948  230 
\ndxbold{var}, \ndxbold{tid}, \ndxbold{tvar}, \ndxbold{num}, \ndxbold{xnum}, 
231 
\ndxbold{xstr}, respectively. Typical examples are {\tt x}, {\tt ?x7}, {\tt 

232 
'a}, {\tt ?'a3}, {\tt \#42}, {\tt ''foo bar''}. Here is the precise syntax: 

320  233 
\begin{eqnarray*} 
14955  234 
id & = & letter\,quasiletter^* \\ 
235 
longid & = & id (\mbox{\tt .}id)^+ \\ 

320  236 
var & = & \mbox{\tt ?}id ~~~~ \mbox{\tt ?}id\mbox{\tt .}nat \\ 
237 
tid & = & \mbox{\tt '}id \\ 

238 
tvar & = & \mbox{\tt ?}tid ~~~~ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

239 
\mbox{\tt ?}tid\mbox{\tt .}nat \\ 
14948  240 
num & = & nat ~~~~ \mbox{\tt}nat \\ 
5542  241 
xnum & = & \mbox{\tt \#}nat ~~~~ \mbox{\tt \#}nat \\ 
14955  242 
xstr & = & \mbox{\tt ''~\dots~\tt ''} \\[1ex] 
14960  243 
letter & = & latin ~~ \verb,\<,latin\verb,>, ~~ \verb,\<,latin\,latin\verb,>, ~~ greek ~ \\ 
244 
& & \verb,\<^isub>, ~~ \verb,\<^isup>, \\ 

245 
quasiletter & = & letter ~~ digit ~~ \verb,_, ~~ \verb,', \\ 

14948  246 
latin & = & \verb,a, ~~ \dots ~~ \verb,z, ~~ \verb,A, ~~ \dots ~~ \verb,Z, \\ 
247 
digit & = & \verb,0, ~~ \dots ~~ \verb,9, \\ 

248 
nat & = & digit^+ \\ 

14955  249 
greek & = & \verb,\<alpha>, ~~ \verb,\<beta>, ~~ \verb,\<gamma>, ~~ \verb,\<delta>, ~ \\ 
250 
& & \verb,\<epsilon>, ~~ \verb,\<zeta>, ~~ \verb,\<eta>, ~~ \verb,\<theta>, ~ \\ 

251 
& & \verb,\<iota>, ~~ \verb,\<kappa>, ~~ \verb,\<mu>, ~~ \verb,\<nu>, ~ \\ 

252 
& & \verb,\<xi>, ~~ \verb,\<pi>, ~~ \verb,\<rho>, ~~ \verb,\<sigma>, ~ \\ 

253 
& & \verb,\<tau>, ~~ \verb,\<upsilon>, ~~ \verb,\<phi>, ~~ \verb,\<psi>, ~ \\ 

254 
& & \verb,\<omega>, ~~ \verb,\<Gamma>, ~~ \verb,\<Delta>, ~~ \verb,\<Theta>, ~ \\ 

255 
& & \verb,\<Lambda>, ~~ \verb,\<Xi>, ~~ \verb,\<Pi>, ~~ \verb,\<Sigma>, ~ \\ 

256 
& & \verb,\<Upsilon>, ~~ \verb,\<Phi>, ~~ \verb,\<Psi>, ~~ \verb,\<Omega>, \\ 

320  257 
\end{eqnarray*} 
4543  258 
The lexer repeatedly takes the longest prefix of the input string that 
259 
forms a valid token. A maximal prefix that is both a delimiter and a 

260 
name is treated as a delimiter. Spaces, tabs, newlines and formfeeds 

261 
are separators; they never occur within tokens, except those of class 

262 
$xstr$. 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

263 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

264 
\medskip 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

265 
Delimiters need not be separated by white space. For example, if {\tt } 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

266 
is a delimiter but {\tt } is not, then the string {\tt } is treated as 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

267 
two consecutive occurrences of the token~{\tt }. In contrast, \ML\ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

268 
treats {\tt } as a single symbolic name. The consequence of Isabelle's 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

269 
more liberal scheme is that the same string may be parsed in different ways 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

270 
after extending the syntax: after adding {\tt } as a delimiter, the input 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

271 
{\tt } is treated as a single token. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

272 

320  273 
A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally 
274 
a pair of base name and index (\ML\ type \mltydx{indexname}). These 

275 
components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or 

276 
run together as in {\tt ?x1}. The latter form is possible if the base name 

277 
does not end with digits. If the index is 0, it may be dropped altogether: 

278 
{\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}. 

279 

14948  280 
Tokens of class $num$, $xnum$ or $xstr$ are not used by the metalogic. 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

281 
Objectlogics may provide numerals and string constants by adding appropriate 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

282 
productions and translation functions. 
320  283 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

284 
\medskip 
320  285 
Although name tokens are returned from the lexer rather than the parser, it 
286 
is more logical to regard them as nonterminals. Delimiters, however, are 

287 
terminals; they are just syntactic sugar and contribute nothing to the 

288 
abstract syntax tree. 

289 

290 

3108  291 
\subsection{*Inspecting the syntax} \label{pg:print_syn} 
320  292 
\begin{ttbox} 
293 
syn_of : theory > Syntax.syntax 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

294 
print_syntax : theory > unit 
320  295 
Syntax.print_syntax : Syntax.syntax > unit 
296 
Syntax.print_gram : Syntax.syntax > unit 

297 
Syntax.print_trans : Syntax.syntax > unit 

298 
\end{ttbox} 

299 
The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes 

300 
in \ML. You can display values of this type by calling the following 

301 
functions: 

302 
\begin{ttdescription} 

303 
\item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle 

304 
theory~{\it thy} as an \ML\ value. 

305 

8136  306 
\item[\ttindexbold{print_syntax} $thy$] uses {\tt Syntax.print_syntax} 
307 
to display the syntax part of theory $thy$. 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

308 

320  309 
\item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all 
310 
information contained in the syntax {\it syn}. The displayed output can 

311 
be large. The following two functions are more selective. 

312 

313 
\item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

314 
of~{\it syn}, namely the lexicon, logical types and productions. These are 
320  315 
discussed below. 
316 

317 
\item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation 

318 
part of~{\it syn}, namely the constants, parse/print macros and 

319 
parse/print translations. 

320 
\end{ttdescription} 

321 

12465  322 
The output of the above print functions is divided into labelled sections. 
323 
The grammar is represented by {\tt lexicon}, {\tt logtypes} and {\tt prods}. 

324 
The rest refers to syntactic translations and macro expansion. Here is an 

320  325 
explanation of the various sections. 
326 
\begin{description} 

327 
\item[{\tt lexicon}] lists the delimiters used for lexical 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

328 
analysis.\index{delimiters} 
320  329 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

330 
\item[{\tt logtypes}] lists the types that are regarded the same as {\tt 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

331 
logic} syntactically. Thus types of objectlogics (e.g.\ {\tt nat}, say) 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

332 
will be automatically equipped with the standard syntax of 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

333 
$\lambda$calculus. 
320  334 

335 
\item[{\tt prods}] lists the \rmindex{productions} of the priority grammar. 

336 
The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}. 

337 
Each delimiter is quoted. Some productions are shown with {\tt =>} and 

338 
an attached string. These strings later become the heads of parse 

339 
trees; they also play a vital role when terms are printed (see 

340 
\S\ref{sec:asts}). 

341 

342 
Productions with no strings attached are called {\bf copy 

343 
productions}\indexbold{productions!copy}. Their righthand side must 

344 
have exactly one nonterminal symbol (or name token). The parser does 

345 
not create a new parse tree node for copy productions, but simply 

346 
returns the parse tree of the righthand symbol. 

347 

348 
If the righthand side consists of a single nonterminal with no 

349 
delimiters, then the copy production is called a {\bf chain 

350 
production}. Chain productions act as abbreviations: 

351 
conceptually, they are removed from the grammar by adding new 

352 
productions. Priority information attached to chain productions is 

353 
ignored; only the dummy value $1$ is displayed. 

3108  354 

355 
\item[\ttindex{print_modes}] lists the alternative print modes 

356 
provided by this syntax (see \S\ref{sec:prmodes}). 

320  357 

358 
\item[{\tt consts}, {\tt parse_rules}, {\tt print_rules}] 

359 
relate to macros (see \S\ref{sec:macros}). 

360 

361 
\item[{\tt parse_ast_translation}, {\tt print_ast_translation}] 

362 
list sets of constants that invoke translation functions for abstract 

363 
syntax trees. Section \S\ref{sec:asts} below discusses this obscure 

364 
matter.\index{constants!for translations} 

365 

4597
a0bdee64194c
Fixed a lot of overfull and underfull lines (hboxes)
paulson
parents:
4543
diff
changeset

366 
\item[{\tt parse_translation}, {\tt print_translation}] list the sets 
320  367 
of constants that invoke translation functions for terms (see 
368 
\S\ref{sec:tr_funs}). 

369 
\end{description} 

370 
\index{syntax!Pure)} 

371 

372 

373 
\section{Mixfix declarations} \label{sec:mixfix} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

374 
\index{mixfix declarations(} 
320  375 

376 
When defining a theory, you declare new constants by giving their names, 

377 
their type, and an optional {\bf mixfix annotation}. Mixfix annotations 

378 
allow you to extend Isabelle's basic $\lambda$calculus syntax with 

379 
readable notation. They can express any contextfree priority grammar. 

380 
Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

381 
general than the priority declarations of \ML\ and Prolog. 
320  382 

383 
A mixfix annotation defines a production of the priority grammar. It 

384 
describes the concrete syntax, the translation to abstract syntax, and the 

385 
pretty printing. Special case annotations provide a simple means of 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

386 
specifying infix operators and binders. 
320  387 

388 
\subsection{The general mixfix form} 

389 
Here is a detailed account of mixfix declarations. Suppose the following 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

390 
line occurs within a {\tt consts} or {\tt syntax} section of a {\tt .thy} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

391 
file: 
320  392 
\begin{center} 
393 
{\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)} 

394 
\end{center} 

332  395 
This constant declaration and mixfix annotation are interpreted as follows: 
320  396 
\begin{itemize}\index{productions} 
397 
\item The string {\tt $c$} is the name of the constant associated with the 

398 
production; unless it is a valid identifier, it must be enclosed in 

399 
quotes. If $c$ is empty (given as~{\tt ""}) then this is a copy 

400 
production.\index{productions!copy} Otherwise, parsing an instance of the 

401 
phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$ 

402 
$a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$th 

403 
argument. 

404 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

405 
\item The constant $c$, if nonempty, is declared to have type $\sigma$ 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

406 
({\tt consts} section only). 
320  407 

408 
\item The string $template$ specifies the righthand side of 

409 
the production. It has the form 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

410 
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \] 
320  411 
where each occurrence of {\tt_} denotes an argument position and 
412 
the~$w@i$ do not contain~{\tt _}. (If you want a literal~{\tt _} in 

413 
the concrete syntax, you must escape it as described below.) The $w@i$ 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

414 
may consist of \rmindex{delimiters}, spaces or 
320  415 
\rmindex{pretty printing} annotations (see below). 
416 

417 
\item The type $\sigma$ specifies the production's nonterminal symbols 

418 
(or name tokens). If $template$ is of the form above then $\sigma$ 

419 
must be a function type with at least~$n$ argument positions, say 

420 
$\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are 

421 
derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

422 
below. Any of these may be function types. 
320  423 

424 
\item The optional list~$ps$ may contain at most $n$ integers, say {\tt 

425 
[$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal 

426 
priority\indexbold{priorities} required of any phrase that may appear 

427 
as the $i$th argument. Missing priorities default to~0. 

4543  428 

429 
\item The integer $p$ is the priority of this production. If 

430 
omitted, it defaults to the maximal priority. Priorities range 

431 
between 0 and \ttindexbold{max_pri} (= 1000). 

320  432 

433 
\end{itemize} 

434 
% 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

435 
The resulting production is \[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

436 
A@2^{(p@2)}\, \dots\, A@n^{(p@n)}\, w@n \] where $A$ and the $A@i$ are the 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

437 
nonterminals corresponding to the types $\tau$ and $\tau@i$ respectively. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

438 
The nonterminal symbol associated with a type $(\ldots)ty$ is {\tt logic}, if 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

439 
this is a logical type (namely one of class {\tt logic} excluding {\tt 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

440 
prop}). Otherwise it is $ty$ (note that only the outermost type constructor 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

441 
is taken into account). Finally, the nonterminal of a type variable is {\tt 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

442 
any}. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

443 

911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

444 
\begin{warn} 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

445 
Theories must sometimes declare types for purely syntactic purposes  
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

446 
merely playing the role of nonterminals. One example is \tydx{type}, the 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

447 
builtin type of types. This is a `type of all types' in the syntactic 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

448 
sense only. Do not declare such types under {\tt arities} as belonging to 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

449 
class {\tt logic}\index{*logic class}, for that would make them useless as 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

450 
separate nonterminal symbols. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

451 
\end{warn} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

452 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

453 
Associating nonterminals with types allows a constant's type to specify 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

454 
syntax as well. We can declare the function~$f$ to have type $[\tau@1, 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

455 
\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the layout 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

456 
of the function's $n$ arguments. The constant's name, in this case~$f$, will 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

457 
also serve as the label in the abstract syntax tree. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

458 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

459 
You may also declare mixfix syntax without adding constants to the theory's 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

460 
signature, by using a {\tt syntax} section instead of {\tt consts}. Thus a 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

461 
production need not map directly to a logical function (this typically 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

462 
requires additional syntactic translations, see also 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

463 
Chapter~\ref{chap:syntax}). 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

464 

d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

465 

911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

466 
\medskip 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

467 
As a special case of the general mixfix declaration, the form 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

468 
\begin{center} 
911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

469 
{\tt $c$ ::\ "$\sigma$" ("$template$")} 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

470 
\end{center} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

471 
specifies no priorities. The resulting production puts no priority 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

472 
constraints on any of its arguments and has maximal priority itself. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

473 
Omitting priorities in this manner is prone to syntactic ambiguities unless 
3098  474 
the production's righthand side is fully bracketed, as in 
475 
\verb"if _ then _ else _ fi". 

320  476 

477 
Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"}, 

478 
is sensible only if~$c$ is an identifier. Otherwise you will be unable to 

479 
write terms involving~$c$. 

480 

481 

482 
\subsection{Example: arithmetic expressions} 

483 
\index{examples!of mixfix declarations} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

484 
This theory specification contains a {\tt syntax} section with mixfix 
320  485 
declarations encoding the priority grammar from 
486 
\S\ref{sec:priority_grammars}: 

487 
\begin{ttbox} 

3108  488 
ExpSyntax = Pure + 
320  489 
types 
490 
exp 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

491 
syntax 
1387  492 
"0" :: exp ("0" 9) 
493 
"+" :: [exp, exp] => exp ("_ + _" [0, 1] 0) 

494 
"*" :: [exp, exp] => exp ("_ * _" [3, 2] 2) 

495 
"" :: exp => exp (" _" [3] 3) 

320  496 
end 
497 
\end{ttbox} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

498 
Executing {\tt Syntax.print_gram} reveals the productions derived from the 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

499 
above mixfix declarations (lots of additional information deleted): 
320  500 
\begin{ttbox} 
3108  501 
Syntax.print_gram (syn_of ExpSyntax.thy); 
320  502 
{\out exp = "0" => "0" (9)} 
503 
{\out exp = exp[0] "+" exp[1] => "+" (0)} 

504 
{\out exp = exp[3] "*" exp[2] => "*" (2)} 

505 
{\out exp = "" exp[3] => "" (3)} 

506 
\end{ttbox} 

507 

3108  508 
Note that because {\tt exp} is not of class {\tt logic}, it has been 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

509 
retained as a separate nonterminal. This also entails that the syntax 
3108  510 
does not provide for identifiers or paranthesized expressions. 
511 
Normally you would also want to add the declaration {\tt arities 

512 
exp::logic} after {\tt types} and use {\tt consts} instead of {\tt 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

513 
syntax}. Try this as an exercise and study the changes in the 
867  514 
grammar. 
320  515 

516 
\subsection{The mixfix template} 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

517 
Let us now take a closer look at the string $template$ appearing in mixfix 
320  518 
annotations. This string specifies a list of parsing and printing 
519 
directives: delimiters\index{delimiters}, arguments, spaces, blocks of 

520 
indentation and line breaks. These are encoded by the following character 

521 
sequences: 

522 
\index{pretty printing(} 

523 
\begin{description} 

524 
\item[~$d$~] is a delimiter, namely a nonempty sequence of characters 

525 
other than the special characters {\tt _}, {\tt(}, {\tt)} and~{\tt/}. 

526 
Even these characters may appear if escaped; this means preceding it with 

527 
a~{\tt '} (single quote). Thus you have to write {\tt ''} if you really 

911
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

528 
want a single quote. Furthermore, a~{\tt '} followed by a space separates 
55754d6d399c
new in mixfix annotations: "' " (quote space) separates delimiters without
wenzelm
parents:
885
diff
changeset

529 
delimiters without extra white space being added for printing. 
320  530 

531 
\item[~{\tt_}~] is an argument position, which stands for a nonterminal symbol 

532 
or name token. 

533 

534 
\item[~$s$~] is a nonempty sequence of spaces for printing. This and the 

535 
following specifications do not affect parsing at all. 

536 

537 
\item[~{\tt(}$n$~] opens a pretty printing block. The optional number $n$ 

538 
specifies how much indentation to add when a line break occurs within the 

539 
block. If {\tt(} is not followed by digits, the indentation defaults 

540 
to~0. 

541 

542 
\item[~{\tt)}~] closes a pretty printing block. 

543 

544 
\item[~{\tt//}~] forces a line break. 

545 

546 
\item[~{\tt/}$s$~] allows a line break. Here $s$ stands for the string of 

547 
spaces (zero or more) right after the {\tt /} character. These spaces 

548 
are printed if the break is not taken. 

549 
\end{description} 

550 
For example, the template {\tt"(_ +/ _)"} specifies an infix operator. 

551 
There are two argument positions; the delimiter~{\tt+} is preceded by a 

552 
space and followed by a space or line break; the entire phrase is a pretty 

553 
printing block. Other examples appear in Fig.\ts\ref{fig:set_trans} below. 

554 
Isabelle's pretty printer resembles the one described in 

6592  555 
Paulson~\cite{paulsonml2}. 
320  556 

557 
\index{pretty printing)} 

558 

559 

560 
\subsection{Infixes} 

561 
\indexbold{infixes} 

562 

3108  563 
Infix operators associating to the left or right can be declared using 
564 
{\tt infixl} or {\tt infixr}. Basically, the form {\tt $c$ ::\ 

565 
$\sigma$ (infixl $p$)} abbreviates the mixfix declarations 

320  566 
\begin{ttbox} 
1387  567 
"op \(c\)" :: \(\sigma\) ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\)) 
568 
"op \(c\)" :: \(\sigma\) ("op \(c\)") 

320  569 
\end{ttbox} 
1387  570 
and {\tt $c$ ::\ $\sigma$ (infixr $p$)} abbreviates the mixfix declarations 
320  571 
\begin{ttbox} 
1387  572 
"op \(c\)" :: \(\sigma\) ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\)) 
573 
"op \(c\)" :: \(\sigma\) ("op \(c\)") 

320  574 
\end{ttbox} 
575 
The infix operator is declared as a constant with the prefix {\tt op}. 

576 
Thus, prefixing infixes with \sdx{op} makes them behave like ordinary 

577 
function symbols, as in \ML. Special characters occurring in~$c$ must be 

578 
escaped, as in delimiters, using a single quote. 

579 

3108  580 
A slightly more general form of infix declarations allows constant 
581 
names to be independent from their concrete syntax, namely \texttt{$c$ 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

582 
::\ $\sigma$\ (infixl "$sy$" $p$)}, the same for \texttt{infixr}. As 
3108  583 
an example consider: 
584 
\begin{ttbox} 

585 
and :: [bool, bool] => bool (infixr "&" 35) 

586 
\end{ttbox} 

587 
The internal constant name will then be just \texttt{and}, without any 

588 
\texttt{op} prefixed. 

589 

320  590 

591 
\subsection{Binders} 

592 
\indexbold{binders} 

593 
\begingroup 

594 
\def\Q{{\cal Q}} 

595 
A {\bf binder} is a variablebinding construct such as a quantifier. The 

596 
constant declaration 

597 
\begin{ttbox} 

1387  598 
\(c\) :: \(\sigma\) (binder "\(\Q\)" [\(pb\)] \(p\)) 
320  599 
\end{ttbox} 
600 
introduces a constant~$c$ of type~$\sigma$, which must have the form 

601 
$(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where 

602 
$x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$ 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

603 
and the whole term has type~$\tau@3$. The optional integer $pb$ 
1060
a122584b5cc5
In binders, the default body priority is now p instead of 0.
lcp
parents:
911
diff
changeset

604 
specifies the body's priority, by default~$p$. Special characters 
877  605 
in $\Q$ must be escaped using a single quote. 
320  606 

864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

607 
The declaration is expanded internally to something like 
320  608 
\begin{ttbox} 
3098  609 
\(c\)\hskip3pt :: (\(\tau@1\) => \(\tau@2\)) => \(\tau@3\) 
610 
"\(\Q\)" :: [idts, \(\tau@2\)] => \(\tau@3\) ("(3\(\Q\)_./ _)" [0, \(pb\)] \(p\)) 

320  611 
\end{ttbox} 
612 
Here \ndx{idts} is the nonterminal symbol for a list of identifiers with 

332  613 
\index{type constraints} 
320  614 
optional type constraints (see Fig.\ts\ref{fig:pure_gram}). The 
615 
declaration also installs a parse translation\index{translations!parse} 

616 
for~$\Q$ and a print translation\index{translations!print} for~$c$ to 

617 
translate between the internal and external forms. 

618 

619 
A binder of type $(\sigma \To \tau) \To \tau$ can be nested by giving a 

620 
list of variables. The external form $\Q~x@1~x@2 \ldots x@n. P$ 

621 
corresponds to the internal form 

622 
\[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)). \] 

623 

624 
\medskip 

625 
For example, let us declare the quantifier~$\forall$:\index{quantifiers} 

626 
\begin{ttbox} 

1387  627 
All :: ('a => o) => o (binder "ALL " 10) 
320  628 
\end{ttbox} 
629 
This lets us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL 

630 
$x$.$P$}. When printing, Isabelle prefers the latter form, but must fall 

631 
back on ${\tt All}(P)$ if $P$ is not an abstraction. Both $P$ and {\tt ALL 

632 
$x$.$P$} have type~$o$, the type of formulae, while the bound variable 

633 
can be polymorphic. 

634 
\endgroup 

635 

636 
\index{mixfix declarations)} 

637 

3108  638 

639 
\section{*Alternative print modes} \label{sec:prmodes} 

640 
\index{print modes(} 

641 
% 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

642 
Isabelle's pretty printer supports alternative output syntaxes. These 
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

643 
may be used independently or in cooperation. The currently active 
3108  644 
print modes (with precedence from left to right) are determined by a 
645 
reference variable. 

646 
\begin{ttbox}\index{*print_mode} 

647 
print_mode: string list ref 

648 
\end{ttbox} 

649 
Initially this may already contain some print mode identifiers, 

650 
depending on how Isabelle has been invoked (e.g.\ by some user 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

651 
interface). So changes should be incremental  adding or deleting 
3108  652 
modes relative to the current value. 
653 

12465  654 
Any \ML{} string is a legal print mode identifier, without any predeclaration 
655 
required. The following names should be considered reserved, though: 

656 
\texttt{""} (the empty string), \texttt{symbols}, \texttt{xsymbols}, and 

657 
\texttt{latex}. 

3108  658 

659 
There is a separate table of mixfix productions for pretty printing 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

660 
associated with each print mode. The currently active ones are 
3108  661 
conceptually just concatenated from left to right, with the standard 
662 
syntax output table always coming last as default. Thus mixfix 

663 
productions of preceding modes in the list may override those of later 

664 
ones. Also note that token translations are always relative to some 

665 
print mode (see \S\ref{sec:tok_tr}). 

666 

667 
\medskip The canonical application of print modes is optional printing 

668 
of mathematical symbols from a special screen font instead of {\sc 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

669 
ascii}. Another example is to reuse Isabelle's advanced 
3108  670 
$\lambda$term printing mechanisms to generate completely different 
3228  671 
output, say for interfacing external tools like \rmindex{model 
672 
checkers} (see also \texttt{HOL/Modelcheck}). 

3108  673 

674 
\index{print modes)} 

675 

676 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

677 
\section{Ambiguity of parsed expressions} \label{sec:ambiguity} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

678 
\index{ambiguity!of parsed expressions} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

679 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

680 
To keep the grammar small and allow common productions to be shared 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

681 
all logical types (except {\tt prop}) are internally represented 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

682 
by one nonterminal, namely {\tt logic}. This and omitted or too freely 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

683 
chosen priorities may lead to ways of parsing an expression that were 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

684 
not intended by the theory's maker. In most cases Isabelle is able to 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

685 
select one of multiple parse trees that an expression has lead 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

686 
to by checking which of them can be typed correctly. But this may not 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

687 
work in every case and always slows down parsing. 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

688 
The warning and error messages that can be produced during this process are 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

689 
as follows: 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

690 

880  691 
If an ambiguity can be resolved by type inference the following 
692 
warning is shown to remind the user that parsing is (unnecessarily) 

3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

693 
slowed down. In cases where it's not easily possible to eliminate the 
880  694 
ambiguity the frequency of the warning can be controlled by changing 
883
92abd2ad9d08
renamed Sign.ambiguity_level to Syntax.ambiguity_level
clasohm
parents:
880
diff
changeset

695 
the value of {\tt Syntax.ambiguity_level} which has type {\tt int 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

696 
ref}. Its default value is 1 and by increasing it one can control how 
883
92abd2ad9d08
renamed Sign.ambiguity_level to Syntax.ambiguity_level
clasohm
parents:
880
diff
changeset

697 
many parse trees are necessary to generate the warning. 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

698 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

699 
\begin{ttbox} 
3801  700 
{\out Ambiguous input "\dots"} 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

701 
{\out produces the following parse trees:} 
3801  702 
{\out \dots} 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

703 
{\out Fortunately, only one parse tree is type correct.} 
3801  704 
{\out You may still want to disambiguate your grammar or your input.} 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

705 
\end{ttbox} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

706 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

707 
The following message is normally caused by using the same 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

708 
syntax in two different productions: 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

709 

bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

710 
\begin{ttbox} 
3802  711 
{\out Ambiguous input "..."} 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

712 
{\out produces the following parse trees:} 
3802  713 
{\out \dots} 
714 
{\out More than one term is type correct:} 

715 
{\out \dots} 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

716 
\end{ttbox} 
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

717 

866
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
clasohm
parents:
864
diff
changeset

718 
Ambiguities occuring in syntax translation rules cannot be resolved by 
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
clasohm
parents:
864
diff
changeset

719 
type inference because it is not necessary for these rules to be type 
3485
f27a30a18a17
Now there are TWO spaces after each full stop, so that the Emacs sentence
paulson
parents:
3318
diff
changeset

720 
correct. Therefore Isabelle always generates an error message and the 
866
2d3d020eef11
added documentation of bind_thm, qed, qed_goal, get_thm, thms_of
clasohm
parents:
864
diff
changeset

721 
ambiguity should be eliminated by changing the grammar or the rule. 
711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

722 

320  723 

724 
\section{Example: some minimal logics} \label{sec:min_logics} 

725 
\index{examples!of logic definitions} 

726 

727 
This section presents some examples that have a simple syntax. They 

728 
demonstrate how to define new objectlogics from scratch. 

729 

711
bb868a30e66f
updated remarks about grammar; added section about ambiguities
clasohm
parents:
452
diff
changeset

730 
First we must define how an objectlogic syntax is embedded into the 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

731 
metalogic. Since all theorems must conform to the syntax for~\ndx{prop} 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

732 
(see Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the 
320  733 
objectlevel syntax. Assume that the syntax of your objectlogic defines a 
864
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

734 
metatype~\tydx{o} of formulae which refers to the nonterminal {\tt logic}. 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

735 
These formulae can now appear in axioms and theorems wherever \ndx{prop} does 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

736 
if you add the production 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

737 
\[ prop ~=~ logic. \] 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

738 
This is not supposed to be a copy production but an implicit coercion from 
d63b111b917a
quite a lot of minor and major revisions (inspecting theories, read_axm,
wenzelm
parents:
711
diff
changeset

739 
formulae to propositions: 
320  740 
\begin{ttbox} 
741 
Base = Pure + 

742 
types 

743 
o 

744 
arities 

745 
o :: logic 

746 
consts 

1387  747 
Trueprop :: o => prop ("_" 5) 
320  748 
end 
749 
\end{ttbox} 

750 
The constant \cdx{Trueprop} (the name is arbitrary) acts as an invisible 

332  751 
coercion function. Assuming this definition resides in a file {\tt Base.thy}, 
320  752 
you have to load it with the command {\tt use_thy "Base"}. 
753 

754 
One of the simplest nontrivial logics is {\bf minimal logic} of 

755 
implication. Its definition in Isabelle needs no advanced features but 

756 
illustrates the overall mechanism nicely: 

757 
\begin{ttbox} 

758 
Hilbert = Base + 

759 
consts 

1387  760 
">" :: [o, o] => o (infixr 10) 
320  761 
rules 
762 
K "P > Q > P" 

763 
S "(P > Q > R) > (P > Q) > P > R" 

764 
MP "[ P > Q; P ] ==> Q" 

765 
end 

766 
\end{ttbox} 

332  767 
After loading this definition from the file {\tt Hilbert.thy}, you can 
320  768 
start to prove theorems in the logic: 
769 
\begin{ttbox} 

5205  770 
Goal "P > P"; 
320  771 
{\out Level 0} 
772 
{\out P > P} 

773 
{\out 1. P > P} 

774 
\ttbreak 

775 
by (resolve_tac [Hilbert.MP] 1); 

776 
{\out Level 1} 

777 
{\out P > P} 

778 
{\out 1. ?P > P > P} 

779 
{\out 2. ?P} 

780 
\ttbreak 

781 
by (resolve_tac [Hilbert.MP] 1); 

782 
{\out Level 2} 

783 
{\out P > P} 

784 
{\out 1. ?P1 > ?P > P > P} 

785 
{\out 2. ?P1} 

786 
{\out 3. ?P} 

787 
\ttbreak 

788 
by (resolve_tac [Hilbert.S] 1); 

789 
{\out Level 3} 

790 
{\out P > P} 

791 
{\out 1. P > ?Q2 > P} 

792 
{\out 2. P > ?Q2} 

793 
\ttbreak 

794 
by (resolve_tac [Hilbert.K] 1); 

795 
{\out Level 4} 

796 
{\out P > P} 

797 
{\out 1. P > ?Q2} 

798 
\ttbreak 

799 
by (resolve_tac [Hilbert.K] 1); 

800 
{\out Level 5} 

801 
{\out P > P} 

802 
{\out No subgoals!} 

803 
\end{ttbox} 

804 
As we can see, this Hilbertstyle formulation of minimal logic is easy to 

805 
define but difficult to use. The following natural deduction formulation is 

806 
better: 

807 
\begin{ttbox} 

808 
MinI = Base + 

809 
consts 

1387  810 
">" :: [o, o] => o (infixr 10) 
320  811 
rules 
812 
impI "(P ==> Q) ==> P > Q" 

813 
impE "[ P > Q; P ] ==> Q" 

814 
end 

815 
\end{ttbox} 

816 
Note, however, that although the two systems are equivalent, this fact 

817 
cannot be proved within Isabelle. Axioms {\tt S} and {\tt K} can be 

818 
derived in {\tt MinI} (exercise!), but {\tt impI} cannot be derived in {\tt 

819 
Hilbert}. The reason is that {\tt impI} is only an {\bf admissible} rule 

820 
in {\tt Hilbert}, something that can only be shown by induction over all 

821 
possible proofs in {\tt Hilbert}. 

822 

823 
We may easily extend minimal logic with falsity: 

824 
\begin{ttbox} 

825 
MinIF = MinI + 

826 
consts 

1387  827 
False :: o 
320  828 
rules 
829 
FalseE "False ==> P" 

830 
end 

831 
\end{ttbox} 

832 
On the other hand, we may wish to introduce conjunction only: 

833 
\begin{ttbox} 

834 
MinC = Base + 

835 
consts 

1387  836 
"&" :: [o, o] => o (infixr 30) 
320  837 
\ttbreak 
838 
rules 

839 
conjI "[ P; Q ] ==> P & Q" 

840 
conjE1 "P & Q ==> P" 

841 
conjE2 "P & Q ==> Q" 

842 
end 

843 
\end{ttbox} 

844 
And if we want to have all three connectives together, we create and load a 

3108  845 
theory file consisting of a single line: 
320  846 
\begin{ttbox} 
847 
MinIFC = MinIF + MinC 

848 
\end{ttbox} 

849 
Now we can prove mixed theorems like 

850 
\begin{ttbox} 

5205  851 
Goal "P & False > Q"; 
320  852 
by (resolve_tac [MinI.impI] 1); 
853 
by (dresolve_tac [MinC.conjE2] 1); 

854 
by (eresolve_tac [MinIF.FalseE] 1); 

855 
\end{ttbox} 

856 
Try this as an exercise! 

5371  857 

858 

859 
%%% Local Variables: 

860 
%%% mode: latex 

861 
%%% TeXmaster: "ref" 

862 
%%% End: 