
1 \chapter{Defining Logics} \label{DefiningLogics} 

2 This chapter is intended for Isabelle experts. It explains how to define new 

3 logical systems, Isabelle's {\it raison d'\^etre}. Isabelle logics are 

4 hierarchies of theories. A number of simple examples are contained in the 

5 introductory manual; the full syntax of theory definitions is shown in the 

6 {\em Reference Manual}. The purpose of this chapter is to explain the 

7 remaining subtleties, especially some context conditions on the class 

8 structure and the definition of new mixfix syntax. A full understanding of 

9 the material requires knowledge of the internal representation of terms (data 

10 type {\tt term}) as detailed in the {\em Reference Manual}. Sections marked 

11 with a * can be skipped on first reading. 

12 

13 

14 \section{Classes and Types *} 

15 \index{*arities!context conditions} 

16 

17 Type declarations are subject to the following two wellformedness 

18 conditions: 

19 \begin{itemize} 

20 \item There are no two declarations $ty :: (\vec{r})c$ and $ty :: (\vec{s})c$ 

21 with $\vec{r} \neq \vec{s}$. For example 

22 \begin{ttbox} 

23 types ty 1 

24 arities ty :: (\{logic\}) logic 

25 ty :: (\{\})logic 

26 \end{ttbox} 

27 leads to an error message and fails. 

28 \item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty :: 

29 (s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold 

30 for $i=1,\dots,n$. The relationship $\preceq$, defined as 

31 \[ s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c, \] 

32 expresses that the set of types represented by $s'$ is a subset of the set of 

33 types represented by $s$. For example 

34 \begin{ttbox} 

35 classes term < logic 

36 types ty 1 

37 arities ty :: (\{logic\})logic 

38 ty :: (\{\})term 

39 \end{ttbox} 

40 leads to an error message and fails. 

41 \end{itemize} 

42 These conditions guarantee principal types~\cite{nipkowprehofer}. 

43 

44 \section{Precedence Grammars} 

45 \label{PrecedenceGrammars} 

46 \index{precedence grammar(} 

47 

48 The precise syntax of a logic is best defined by a contextfree grammar. 

49 These grammars obey the following conventions: identifiers denote 

50 nonterminals, {\tt typewriter} fount denotes terminals, repetition is 

51 indicated by \dots, and alternatives are separated by $$. 

52 

53 In order to simplify the description of mathematical languages, we introduce 

54 an extended format which permits {\bf precedences}\index{precedence}. This 

55 scheme generalizes precedence declarations in \ML\ and {\sc prolog}. In this 

56 extended grammar format, nonterminals are decorated by integers, their 

57 precedence. In the sequel, precedences are shown as subscripts. A nonterminal 

58 $A@p$ on the righthand side of a production may only be replaced using a 

59 production $A@q = \gamma$ where $p \le q$. 

60 

61 Formally, a set of context free productions $G$ induces a derivation 

62 relation $\rew@G$ on strings as follows: 

63 \[ \alpha A@p \beta ~\rew@G~ \alpha\gamma\beta ~~~iff~~~ 

64 \exists q \ge p.~(A@q=\gamma) \in G 

65 \] 

66 Any extended grammar of this kind can be translated into a normal context 

67 free grammar. However, this translation may require the introduction of a 

68 large number of new nonterminals and productions. 

69 

70 \begin{example} 

71 \label{PrecedenceEx} 

72 The following simple grammar for arithmetic expressions demonstrates how 

73 binding power and associativity of operators can be enforced by precedences. 

74 \begin{center} 

75 \begin{tabular}{rclr} 

76 $A@9$ & = & {\tt0} \\ 

77 $A@9$ & = & {\tt(} $A@0$ {\tt)} \\ 

78 $A@0$ & = & $A@0$ {\tt+} $A@1$ \\ 

79 $A@2$ & = & $A@3$ {\tt*} $A@2$ \\ 

80 $A@3$ & = & {\tt} $A@3$ 

81 \end{tabular} 

82 \end{center} 

83 The choice of precedences determines that \verb$$ binds tighter than 

84 \verb$*$ which binds tighter than \verb$+$, and that \verb$+$ and \verb$*$ 

85 associate to the left and right, respectively. 

86 \end{example} 

87 

88 To minimize the number of subscripts, we adopt the following conventions: 

89 \begin{itemize} 

90 \item all precedences $p$ must be in the range $0 \leq p \leq max_pri$ for 

91 some fixed $max_pri$. 

92 \item precedence $0$ on the righthand side and precedence $max_pri$ on the 

93 lefthand side may be omitted. 

94 \end{itemize} 

95 In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$. 

96 

97 Using these conventions and assuming $max_pri=9$, the grammar in 

98 Example~\ref{PrecedenceEx} becomes 

99 \begin{center} 

100 \begin{tabular}{rclc} 

101 $A$ & = & {\tt0} & \hspace*{4em} \\ 

102 & $$ & {\tt(} $A$ {\tt)} \\ 

103 & $$ & $A$ {\tt+} $A@1$ & (0) \\ 

104 & $$ & $A@3$ {\tt*} $A@2$ & (2) \\ 

105 & $$ & {\tt} $A@3$ & (3) 

106 \end{tabular} 

107 \end{center} 

108 

109 \index{precedence grammar)} 

110 

111 \section{Basic syntax *} 

112 

113 An informal account of most of Isabelle's syntax (metalogic, types etc) is 

114 contained in {\em Introduction to Isabelle}. A precise description using a 

115 precedence grammar is shown in Figure~\ref{MetaLogicSyntax}. This description 

116 is the basis of all extensions by objectlogics. 

117 \begin{figure}[htb] 

118 \begin{center} 

119 \begin{tabular}{rclc} 

120 $prop$ &=& \ttindex{PROP} $aprop$ ~~$$~~ {\tt(} $prop$ {\tt)} \\ 

121 &$$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\ 

122 &$$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\ 

123 &$$& {\tt[} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt]} {\tt==>} $prop@1$ & (1) \\ 

124 &$$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\ 

125 $logic$ &=& $prop$ ~~$$~~ $fun$ \\\\ 

126 $aprop$ &=& $id$ ~~$$~~ $var$ 

127 ~~$$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\ 

128 $fun$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $fun$ {\tt)} \\ 

129 &$$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\ 

130 $idts$ &=& $idt$ ~~$$~~ $idt@1$ $idts$ \\\\ 

131 $idt$ &=& $id$ ~~$$~~ {\tt(} $idt$ {\tt)} \\ 

132 &$$& $id$ \ttindex{::} $type$ & (0) \\\\ 

133 $type$ &=& $tfree$ ~~$$~~ $tvar$ \\ 

134 &$$& $tfree$ {\tt::} $sort$ ~~$$~~ $tvar$ {\tt::} $sort$ \\ 

135 &$$& $id$ ~~$$~~ $type@{max_pri}$ $id$ 

136 ~~$$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\ 

137 &$$& $type@1$ \ttindex{=>} $type$ & (0) \\ 

138 &$$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\ 

139 &$$& {\tt(} $type$ {\tt)} \\\\ 

140 $sort$ &=& $id$ ~~$$~~ {\tt\{\}} 

141 ~~$$~~ {\tt\{} $id$ {\tt,} \dots {\tt,} $id$ {\tt\}} 

142 \end{tabular}\index{*"!"!}\index{*"["}\index{*""]} 

143 \indexbold{type@$type$}\indexbold{sort@$sort$}\indexbold{idts@$idts$} 

144 \indexbold{logic@$logic$}\indexbold{prop@$prop$}\indexbold{fun@$fun$} 

145 \end{center} 

146 \caption{MetaLogic Syntax} 

147 \label{MetaLogicSyntax} 

148 \end{figure} 

149 The following main categories are defined: 

150 \begin{description} 

151 \item[$prop$] Terms of type $prop$, i.e.\ formulae of the metalogic. 

152 \item[$aprop$] Atomic propositions. 

153 \item[$logic$] Terms of types in class $logic$. Initially, $logic$ contains 

154 merely $prop$. As the syntax is extended by new objectlogics, more 

155 productions for $logic$ are added (see below). 

156 \item[$fun$] Terms potentially of function type. 

157 \item[$type$] Types. 

158 \item[$idts$] a list of identifiers, possibly constrained by types. Note 

159 that $x::nat~y$ is parsed as $x::(nat~y)$, i.e.\ $y$ is treated like a 

160 type constructor applied to $nat$. 

161 \end{description} 

162 

163 The predefined types $id$, $var$, $tfree$ and $tvar$ represent identifiers 

164 ({\tt f}), unknowns ({\tt ?f}), type variables ({\tt 'a}), and type unknowns 

165 ({\tt ?'a}) respectively. If we think of them as nonterminals with 

166 predefined syntax, we may assume that all their productions have precedence 

167 $max_pri$. 

168 

169 \subsection{Logical types and default syntax} 

170 

171 Isabelle is concerned with mathematical languages which have a certain 

172 minimal vocabulary: identifiers, variables, parentheses, and the lambda 

173 calculus. Logical types, i.e.\ those of class $logic$, are automatically 

174 equipped with this basic syntax. More precisely, for any type constructor 

175 $ty$ with arity $(\dots)c$, where $c$ is a subclass of $logic$, the following 

176 productions are added: 

177 \begin{center} 

178 \begin{tabular}{rclc} 

179 $ty$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $ty$ {\tt)} \\ 

180 &$$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\ 

181 &$$& $ty@{max_pri}$ {\tt::} $type$\\\\ 

182 $logic$ &=& $ty$ 

183 \end{tabular} 

184 \end{center} 

185 

186 

187 \section{Mixfix syntax} 

188 \index{mixfix(} 

189 

190 We distinguish between abstract and concrete syntax. The {\em abstract} 

191 syntax is given by the typed constants of a theory. Abstract syntax trees are 

192 welltyped terms, i.e.\ values of \ML\ type {\tt term}. If none of the 

193 constants are introduced with mixfix annotations, there is no concrete syntax 

194 to speak of: terms can only be abstractions or applications of the form 

195 $f(t@1,\dots,t@n)$, where $f$ is a constant or variable. Since this notation 

196 quickly becomes unreadable, Isabelle supports syntax definitions in the form 

197 of unrestricted contextfree grammars using mixfix annotations. 

198 

199 Mixfix annotations describe the {\em concrete} syntax, its translation into 

200 the abstract syntax, and a prettyprinting scheme, all in one. Isabelle 

201 syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em mixfix\/} syntax. 

202 Each mixfix annotation defines a precedence grammar production and associates 

203 an Isabelle constant with it. 

204 

205 A {\em mixfix declaration} {\tt consts $c$ ::\ $\tau$ ($sy$ $ps$ $p$)} is 

206 interpreted as a grammar pro\duction as follows: 

207 \begin{itemize} 

208 \item $sy$ is the righthand side of this production, specified as a {\em 

209 mixfix annotation}. In general, $sy$ is of the form 

210 $\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n$, where each occurrence of 

211 ``\ttindex{_}'' denotes an argument/nonterminal and the strings 

212 $\alpha@i$ do not contain ``{\tt_}''. 

213 \item $\tau$ specifies the types of the nonterminals on the left and right 

214 hand side. If $sy$ is of the form above, $\tau$ must be of the form 

215 $[\tau@1,\dots,\tau@n] \To \tau'$. Then argument $i$ is of type $\tau@i$ 

216 and the result, i.e.\ the lefthand side of the production, is of type 

217 $\tau'$. Both the $\tau@i$ and $\tau'$ may be function types. 

218 \item $c$ is the name of the Isabelle constant associated with this production. 

219 Parsing an instance of the phrase $sy$ generates the {\tt term} {\tt 

220 Const($c$,dummyT\footnote{Proper types are inserted later on. See 

221 \S\ref{Typing}})\$$a@1$\$$\dots$\$$a@n$}\index{*dummyT}, where $a@i$ is 

222 the term generated by parsing the $i^{th}$ argument. 

223 \item $ps$ must be of the form $[p@1,\dots,p@n]$, where $p@i$ is the 

224 minimal precedence\index{precedence} required of any phrase that may appear 

225 as the $i^{th}$ argument. The null list is interpreted as a list of 0's of 

226 the appropriate length. 

227 \item $p$ is the precedence of this production. 

228 \end{itemize} 

229 Notice that there is a close connection between abstract and concrete syntax: 

230 each production has an associated constant, and types act as {\bf syntactic 

231 categories} in the concrete syntax. To emphasize this connection, we 

232 sometimes refer to the nonterminals on the righthand side of a production as 

233 its arguments and to the nonterminal on the lefthand side as its result. 

234 

235 The maximal legal precedence is called \ttindexbold{max_pri}, which is 

236 currently 1000. If you want to ignore precedences, the safest way to do so is 

237 to use the annotation {\tt($sy$)}: this production puts no precedence 

238 constraints on any of its arguments and has maximal precedence itself, i.e.\ 

239 it is always applicable and does not exclude any productions of its 

240 arguments. 

241 

242 \begin{example} 

243 In mixfix notation the grammar in Example~\ref{PrecedenceEx} can be written 

244 as follows: 

245 \begin{ttbox} 

246 types exp 0 

247 consts "0" :: "exp" ("0" 9) 

248 "+" :: "[exp,exp] => exp" ("_ + _" [0,1] 0) 

249 "*" :: "[exp,exp] => exp" ("_ * _" [3,2] 2) 

250 "" :: "exp => exp" (" _" [3] 3) 

251 \end{ttbox} 

252 Parsing the string \verb!"0 +  0 + 0"! produces the term {\tt 

253 $p$\$($p$\$($m$\$$z$)\$$z$)\$$z$} where {\tt$p =$ Const("+",dummyT)}, 

254 {\tt$m =$ Const("",dummyT)}, and {\tt$z =$ Const("0",dummyT)}. 

255 \end{example} 

256 

257 The interpretation of \ttindex{_} in a mixfix annotation is always as a {\bf 

258 metacharacter}\index{metacharacter} which does not represent itself but 

259 an argument position. The following characters are also metacharacters: 

260 \begin{ttbox} 

261 ' ( ) / 

262 \end{ttbox} 

263 Preceding any character with a quote (\verb$'$) turns it into an ordinary 

264 character. Thus you can write \verb!''! if you really want a single quote. 

265 The purpose of the other metacharacters is explained in 

266 \S\ref{PrettyPrinting}. Remember that in \ML\ strings \verb$\$ is already a 

267 (different kind of) metacharacter. 

268 

269 

270 \subsection{Types and syntactic categories *} 

271 

272 The precise mapping from types to syntactic categories is defined by the 

273 following function: 

274 \begin{eqnarray*} 

275 N(\tau@1\To\tau@2) &=& fun \\ 

276 N((\tau@1,\dots,\tau@n)ty) &=& ty \\ 

277 N(\alpha) &=& logic 

278 \end{eqnarray*} 

279 Only the outermost type constructor is taken into account and type variables 

280 can range over all logical types. This catches some illtyped terms (like 

281 $Cons(x,0)$, where $Cons :: [\alpha,\alpha list] \To \alpha list$ and $0 :: 

282 nat$) but leaves the real work to the type checker. 

283 

284 In terms of the precedence grammar format introduced in 

285 \S\ref{PrecedenceGrammars}, the declaration 

286 \begin{ttbox} 

287 consts \(c\) :: "[\(\tau@1\),\dots,\(\tau@n\)]\(\To\tau\)" ("\(\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n\)") [\(p@1\),\dots,\(p@n\)] \(p\)) 

288 \end{ttbox} 

289 defines the production 

290 \[ N(\tau)@p ~~=~~ \alpha@0 ~N(\tau@1)@{p@1}~ \alpha@1~ \dots 

291 ~\alpha@{n1} ~N(\tau@n)@{p@n}~ \alpha@n 

292 \] 

293 

294 \subsection{Copy productions *} 

295 

296 Productions which do not create a new node in the abstract syntax tree are 

297 called {\bf copy productions}. They must have exactly one nonterminal on 

298 the right hand side. The term generated when parsing that nonterminal is 

299 simply passed up as the result of parsing the whole copy production. In 

300 Isabelle a copy production is indicated by an empty constant name, i.e.\ by 

301 \begin{ttbox} 

302 consts "" :: \(\tau\) (\(sy\) \(ps\) \(p\)) 

303 \end{ttbox} 

304 

305 A special kind of copy production is one where, modulo white space, $sy$ is 

306 {\tt"_"}. It is called a {\bf chain production}. Chain productions should be 

307 seen as an abbreviation mechanism. Conceptually, they are removed from the 

308 grammar by adding appropriate new rules. Precedence information attached to 

309 chain productions is ignored. The following example demonstrates the effect: 

310 the grammar defined by 

311 \begin{ttbox} 

312 types A,B,C 0 

313 consts AB :: "B => A" ("A _" [10] 517) 

314 "" :: "C => B" ("_" [0] 100) 

315 x :: "C" ("x" 5) 

316 y :: "C" ("y" 15) 

317 \end{ttbox} 

318 admits {\tt"A y"} but not {\tt"A x"}. Had the constant in the second 

319 production been some nonempty string, both {\tt"A y"} and {\tt"A x"} would 

320 be legal. 

321 

322 \index{mixfix)} 

323 

324 \section{Lexical conventions} 

325 

326 The lexical analyzer distinguishes the following kinds of tokens: delimiters, 

327 identifiers, unknowns, type variables and type unknowns. 

328 

329 Delimiters are userdefined, i.e.\ they are extracted from the syntax 

330 definition. If $\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n$ is a mixfix 

331 annotation, each $\alpha@i$ is decomposed into substrings 

332 $\beta@1~\dots~\beta@k$ which are separated by and do not contain 

333 \bfindex{white space} ( = blanks, tabs, newlines). Each $\beta@j$ becomes a 

334 delimiter. Thus a delimiter can be an arbitrary string not containing white 

335 space. 

336 

337 The lexical syntax of identifiers and variables ( = unknowns) is defined in 

338 the introductory manual. Parsing an identifier $f$ generates {\tt 

339 Free($f$,dummyT)}\index{*dummyT}. Parsing a variable {\tt?}$v$ generates 

340 {\tt Var(($u$,$i$),dummyT)} where $i$ is the integer value of the longest 

341 numeric suffix of $v$ (possibly $0$), and $u$ is the remaining prefix. 

342 Parsing a variable {\tt?}$v{.}i$ generates {\tt Var(($v$,$i$),dummyT)}. The 

343 following table covers the four different cases that can arise: 

344 \begin{center}\tt 

345 \begin{tabular}{cccc} 

346 "?v" & "?v.7" & "?v5" & "?v7.5" \\ 

347 Var(("v",0),$d$) & Var(("v",7),$d$) & Var(("v",5),$d$) & Var(("v7",5),$d$) 

348 \end{tabular} 

349 \end{center} 

350 where $d = {\tt dummyT}$. 

351 

352 In mixfix annotations, \ttindexbold{id}, \ttindexbold{var}, 

353 \ttindexbold{tfree} and \ttindexbold{tvar} are the predefined categories of 

354 identifiers, unknowns, type variables and type unknowns, respectively. 

355 

356 

357 The lexical analyzer translates input strings to token lists by repeatedly 

358 taking the maximal prefix of the input string that forms a valid token. A 

359 maximal prefix that is both a delimiter and an identifier or variable (like 

360 {\tt ALL}) is treated as a delimiter. White spaces are separators. 

361 

362 An important consequence of this translation scheme is that delimiters need 

363 not be separated by white space to be recognized as separate. If \verb$""$ 

364 is a delimiter but \verb$""$ is not, the string \verb$""$ is treated as 

365 two consecutive occurrences of \verb$""$. This is in contrast to \ML\ which 

366 would treat \verb$""$ as a single (undeclared) identifier. The 

367 consequence of Isabelle's more liberal scheme is that the same string may be 

368 parsed in a different way after extending the syntax: after adding 

369 \verb$""$ as a delimiter, the input \verb$""$ is treated as 

370 a single occurrence of \verb$""$. 

371 

372 \section{Infix operators} 

373 

374 {\tt Infixl} and {\tt infixr} declare infix operators which associate to the 

375 left and right respectively. As in \ML, prefixing infix operators with 

376 \ttindexbold{op} turns them into curried functions. Infix declarations can 

377 be reduced to mixfix ones as follows: 

378 \begin{center}\tt 

379 \begin{tabular}{l@{~~$\Longrightarrow$~~}l} 

380 "$c$" ::~$\tau$ (\ttindexbold{infixl} $p$) & 

381 "op $c$" ::~$\tau$ ("_ $c$ _" [$p$,$p+1$] $p$) \\ 

382 "$c$" ::~$\tau$ (\ttindexbold{infixr} $p$) & 

383 "op $c$" ::~$\tau$ ("_ $c$ _" [$p+1$,$p$] $p$) 

384 \end{tabular} 

385 \end{center} 

386 

387 

388 \section{Binders} 

389 A {\bf binder} is a variablebinding constant, such as a quantifier. 

390 The declaration 

391 \begin{ttbox} 

392 consts \(c\) :: \(\tau\) (binder \(Q\) \(p\)) 

393 \end{ttbox}\indexbold{*binder} 

394 introduces a binder $c$ of type $\tau$, 

395 which must have the form $(\tau@1\To\tau@2)\To\tau@3$. Its concrete syntax 

396 is $Q~x.t$. A binder is like a generalized quantifier where $\tau@1$ is the 

397 type of the bound variable $x$, $\tau@2$ the type of the body $t$, and 

398 $\tau@3$ the type of the whole term. For example $\forall$ can be declared 

399 like this: 

400 \begin{ttbox} 

401 consts All :: "('a => o) => o" (binder "ALL " 10) 

402 \end{ttbox} 

403 This allows us to write $\forall x.P$ either as {\tt ALL $x$.$P$} or {\tt 

404 All(\%$x$.$P$)}; the latter form is for purists only. 

405 

406 In case $\tau@2 = \tau@3$, nested quantifications can be written as $Q~x@1 

407 \dots x@n.t$. From a syntactic point of view, 

408 \begin{ttbox} 

409 consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)" (binder "\(Q\)" \(p\)) 

410 \end{ttbox} 

411 is equivalent to 

412 \begin{ttbox} 

413 consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)" 

414 "\(Q\)" :: "[idts,\(\tau@2\)] => \(\tau@3\)" ("\(Q\)_. _" \(p\)) 

415 \end{ttbox} 

416 where {\tt idts} is the syntactic category $idts$ defined in 

417 Figure~\ref{MetaLogicSyntax}. 

418 

419 However, there is more to binders than concrete syntax: behind the scenes the 

420 body of the quantified expression has to be converted into a 

421 $\lambda$abstraction (when parsing) and back again (when printing). This 

422 is performed by the translation mechanism, which is discussed below. For 

423 binders, the definition of the required translation functions has been 

424 automated. Many other syntactic forms, such as set comprehension, require 

425 special treatment. 

426 

427 

428 \section{Parse translations *} 

429 \label{Parsetranslations} 

430 \index{parse translation(} 

431 

432 So far we have pretended that there is a close enough relationship between 

433 concrete and abstract syntax to allow an automatic translation from one to 

434 the other using the constant name supplied with each production. In many 

435 cases this scheme is not powerful enough, especially for constructs involving 

436 variable bindings. Therefore the $ML$section of a theory definition can 

437 associate constant names with userdefined translation functions by including 

438 a line 

439 \begin{ttbox} 

440 val parse_translation = \dots 

441 \end{ttbox} 

442 where the righthand side of this binding must be an \MLexpression of type 

443 \verb$(string * (term list > term))list$. 

444 

445 After the input string has been translated into a term according to the 

446 syntax definition, there is a second phase in which the term is translated 

447 using the usersupplied functions in a bottomup manner. Given a list $tab$ 

448 of the above type, a term $t$ is translated as follows. If $t$ is not of the 

449 form {\tt Const($c$,$\tau$)\$$t@1$\$\dots\$$t@n$}, then $t$ is returned 

450 unchanged. Otherwise all $t@i$ are translated into $t@i'$. Let {\tt $t' =$ 

451 Const($c$,$\tau$)\$$t@1'$\$\dots\$$t@n'$}. If there is no pair $(c,f)$ in 

452 $tab$, return $t'$. Otherwise apply $f$ to $[t@1',\dots,t@n']$. If that 

453 raises an exception, return $t'$, otherwise return the result. 

454 \begin{example}\label{listenum} 

455 \MLlists are constructed by {\tt[]} and {\tt::}. For readability the 

456 list \hbox{\tt$x$::$y$::$z$::[]} can be written \hbox{\tt[$x$,$y$,$z$]}. 

457 In Isabelle the two forms of lists are declared as follows: 

458 \begin{ttbox} 

459 types list 1 

460 enum 0 

461 arities list :: (term)term 

462 consts "[]" :: "'a list" ("[]") 

463 ":" :: "['a, 'a list] => 'a list" (infixr 50) 

464 enum :: "enum => 'a list" ("[_]") 

465 sing :: "'a => enum" ("_") 

466 cons :: "['a,enum] => enum" ("_, _") 

467 end 

468 \end{ttbox} 

469 Because \verb$::$ is already used for type constraints, it is replaced by 

470 \verb$:$ as the infix list constructor. 

471 

472 In order to allow list enumeration, the new type {\tt enum} is introduced. 

473 Its only purpose is syntactic and hence it does not need an arity, in 

474 contrast to the logical type {\tt list}. Although \hbox{\tt[$x$,$y$,$z$]} is 

475 syntactically legal, it needs to be translated into a term built up from 

476 \verb$[]$ and \verb$:$. This is what \verb$make_list$ accomplishes: 

477 \begin{ttbox} 

478 val cons = Const("op :", dummyT); 

479 

480 fun make_list (Const("sing",_)$e) = cons $ e $ Const("[]", dummyT) 

481  make_list (Const("cons",_)$e$es) = cons $ e $ make_list es; 

482 \end{ttbox} 

483 To hook this translation up to Isabelle's parser, the theory definition needs 

484 to contain the following $ML$section: 

485 \begin{ttbox} 

486 ML 

487 fun enum_tr[enum] = make_list enum; 

488 val parse_translation = [("enum",enum_tr)] 

489 \end{ttbox} 

490 This causes \verb!Const("enum",_)$!$t$ to be replaced by 

491 \verb$enum_tr[$$t$\verb$]$. 

492 

493 Of course the definition of \verb$make_list$ should be included in the 

494 $ML$section. 

495 \end{example} 

496 \begin{example}\label{SET} 

497 Isabelle represents the set $\{ x \mid P(x) \}$ internally by $Set(\lambda 

498 x.P(x))$. The internal and external forms need separate 

499 constants:\footnote{In practice, the external form typically has a name 

500 beginning with an {\at} sign, such as {\tt {\at}SET}. This emphasizes that 

501 the constant should be used only for parsing/printing.} 

502 \begin{ttbox} 

503 types set 1 

504 arities set :: (term)term 

505 consts Set :: "('a => o) => 'a set" 

506 SET :: "[id,o] => 'a set" ("\{_  _\}") 

507 \end{ttbox} 

508 Parsing {\tt"\{$x$  $P$\}"} according to this syntax yields the term {\tt 

509 Const("SET",dummyT) \$ Free("\(x\)",dummyT) \$ \(p\)}, where $p$ is the 

510 result of parsing $P$. What we need is the term {\tt 

511 Const("Set",dummyT)\$Abs("$x$",dummyT,$p'$)}, where $p'$ is some 

512 ``abstracted'' version of $p$. Therefore we define a function 

513 \begin{ttbox} 

514 fun set_tr[Free(s,T), p] = Const("Set", dummyT) $ 

515 Abs(s, T, abstract_over(Free(s,T), p)); 

516 \end{ttbox} 

517 where \verb$abstract_over: term*term > term$ is a predefined function such 

518 that {\tt abstract_over($u$,$t$)} replaces every occurrence of $u$ in $t$ by 

519 a {\tt Bound} variable of the correct index (i.e.\ 0 at top level). Remember 

520 that {\tt dummyT} is replaced by the correct types at a later stage (see 

521 \S\ref{Typing}). Function {\tt set_tr} is associated with {\tt SET} by 

522 including the \MLtext 

523 \begin{ttbox} 

524 val parse_translation = [("SET", set_tr)]; 

525 \end{ttbox} 

526 \end{example} 

527 

528 If you want to run the above examples in Isabelle, you should note that an 

529 $ML$section needs to contain not just a definition of 

530 \verb$parse_translation$ but also of a variable \verb$print_translation$. The 

531 purpose of the latter is to reverse the effect of the former during printing; 

532 details are found in \S\ref{Printtranslations}. Hence you need to include 

533 the line 

534 \begin{ttbox} 

535 val print_translation = []; 

536 \end{ttbox} 

537 This is instructive because the terms are then printed out in their internal 

538 form. For example the input \hbox{\tt[$x$,$y$,$z$]} is echoed as 

539 \hbox{\tt$x$:$y$:$z$:[]}. This helps to check that your parse translation is 

540 working correctly. 

541 

542 %\begin{warn} 

543 %Explicit type constraints disappear with type checking but are still 

544 %visible to the parse translation functions. 

545 %\end{warn} 

546 

547 \index{parse translation)} 

548 

549 \section{Printing} 

550 

551 Syntax definitions provide printing information in three distinct ways: 

552 through 

553 \begin{itemize} 

554 \item the syntax of the language (as used for parsing), 

555 \item pretty printing information, and 

556 \item print translation functions. 

557 \end{itemize} 

558 The bare mixfix declarations enable Isabelle to print terms, but the result 

559 will not necessarily be pretty and may look different from what you expected. 

560 To produce a pleasing layout, you need to read the following sections. 

561 

562 \subsection{Printing with mixfix declarations} 

563 

564 Let {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} be a term and let 

565 \begin{ttbox} 

566 consts \(c\) :: \(\tau\) (\(sy\)) 

567 \end{ttbox} 

568 be a mixfix declaration where $sy$ is of the form 

569 $\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n$. Printing $t$ according to 

570 $sy$ means printing the string 

571 $\alpha@0\beta@1\alpha@1\ldots\alpha@{n1}\beta@n\alpha@n$, where $\beta@i$ 

572 is the result of printing $t@i$. 

573 

574 Note that the system does {\em not\/} insert blanks. They should be part of 

575 the mixfix syntax if they are required to separate tokens or achieve a 

576 certain layout. 

577 

578 \subsection{Pretty printing} 

579 \label{PrettyPrinting} 

580 \index{pretty printing} 

581 

582 In order to format the output, it is possible to embed pretty printing 

583 directives in mixfix annotations. These directives are ignored during parsing 

584 and affect only printing. The characters {\tt(}, {\tt)} and {\tt/} are 

585 interpreted as metacharacters\index{metacharacter} when found in a mixfix 

586 annotation. Their meaning is 

587 \begin{description} 

588 \item[~{\tt(}~] Open a block. A sequence of digits following it is 

589 interpreted as the \bfindex{indentation} of this block. It causes the 

590 output to be indented by $n$ positions if a line break occurs within the 

591 block. If {\tt(} is not followed by a digit, the indentation defaults to 

592 $0$. 

593 \item[~{\tt)}~] Close a block. 

594 \item[~\ttindex{/}~] Allow a \bfindex{line break}. White space immediately 

595 following {\tt/} is not printed if the line is broken at this point. 

596 \end{description} 

597 

598 \subsection{Print translations *} 

599 \label{Printtranslations} 

600 \index{print translation(} 

601 

602 Since terms are translated after parsing (see \S\ref{Parsetranslations}), 

603 there is a similar mechanism to translate them back before printing. 

604 Therefore the $ML$section of a theory definition can associate constant 

605 names with userdefined translation functions by including a line 

606 \begin{ttbox} 

607 val print_translation = \dots 

608 \end{ttbox} 

609 where the righthand side of this binding is again an \MLexpression of type 

610 \verb$(string * (term list > term))list$. 

611 Including a pair $(c,f)$ in this list causes the printer to print 

612 $f[t@1,\dots,t@n]$ whenever it finds {\tt Const($c$,_)\$$t@1$\$\dots\$$t@n$}. 

613 \begin{example} 

614 Reversing the effect of the parse translation in Example~\ref{listenum} is 

615 accomplished by the following function: 

616 \begin{ttbox} 

617 fun make_enum (Const("op :",_) $ e $ es) = case es of 

618 Const("[]",_) => Const("sing",dummyT) $ e 

619  _ => Const("enum",dummyT) $ e $ make_enum es; 

620 \end{ttbox} 

621 It translates \hbox{\tt$x$:$y$:$z$:[]} to \hbox{\tt[$x$,$y$,$z$]}. However, 

622 if the input does not terminate with an empty list, e.g.\ \hbox{\tt$x$:$xs$}, 

623 \verb$make_enum$ raises exception {\tt Match}. This signals that the 

624 attempted translation has failed and the term should be printed as is. 

625 The connection with Isabelle's pretty printer is established as follows: 

626 \begin{ttbox} 

627 fun enum_tr'[x,xs] = Const("enum",dummyT) $ 

628 make_enum(Const("op :",dummyT)$x$xs); 

629 val print_translation = [("op :", enum_tr')]; 

630 \end{ttbox} 

631 This declaration causes the printer to print \hbox{\tt enum_tr'[$x$,$y$]} 

632 whenever it finds \verb!Const("op :",_)$!$x$\verb!$!$y$. 

633 \end{example} 

634 \begin{example} 

635 In Example~\ref{SET} we showed how to translate the concrete syntax for set 

636 comprehension into the proper internal form. The string {\tt"\{$x$  

637 $P$\}"} now becomes {\tt Const("Set",_)\$Abs("$x$",_,$p$)}. If, however, 

638 the latter term were printed without translating it back, it would result 

639 in {\tt"Set(\%$x$.$P$)"}. Therefore the abstraction has to be turned back 

640 into a term that matches the concrete mixfix syntax: 

641 \begin{ttbox} 

642 fun set_tr'[Abs(x,T,P)] = 

643 let val (x',P') = variant_abs(x,T,P) 

644 in Const("SET",dummyT) $ Free(x',T) $ P' end; 

645 \end{ttbox} 

646 The function \verb$variant_abs$, a basic term manipulation function, replaces 

647 the bound variable $x$ by a {\tt Free} variable $x'$ having a unique name. A 

648 term produced by {\tt set_tr'} can now be printed according to the concrete 

649 syntax defined in Example~\ref{SET} above. 

650 

651 Notice that the application of {\tt set_tr'} fails if the second component of 

652 the argument is not an abstraction, but for example just a {\tt Free} 

653 variable. This is intentional because it signals to the caller that the 

654 translation is inapplicable. As a result {\tt Const("Set",_)\$Free("P",_)} 

655 prints as {\tt"Set(P)"}. 

656 

657 The full theory extension, including concrete syntax and both translation 

658 functions, has the following form: 

659 \begin{ttbox} 

660 types set 1 

661 arities set :: (term)term 

662 consts Set :: "('a => o) => 'a set" 

663 SET :: "[id,o] => 'a set" ("\{_  _\}") 

664 end 

665 ML 

666 fun set_tr[Free(s,T), p] = \dots; 

667 val parse_translation = [("SET", set_tr)]; 

668 fun set_tr'[Abs(x,T,P)] = \dots; 

669 val print_translation = [("Set", set_tr')]; 

670 \end{ttbox} 

671 \end{example} 

672 As during the parse translation process, the types attached to constants 

673 during print translation are ignored. Thus {\tt Const("SET",dummyT)} in 

674 {\tt set_tr'} above is acceptable. The types of {\tt Free}s and {\tt Var}s 

675 however must be preserved because they may get printed (see {\tt 

676 show_types}). 

677 

678 \index{print translation)} 

679 

680 \subsection{Printing a term} 

681 

682 Let $tab$ be the set of all stringfunction pairs of print translations in the 

683 current syntax. 

684 

685 Terms are printed recursively; print translations are applied top down: 

686 \begin{itemize} 

687 \item {\tt Free($x$,_)} is printed as $x$. 

688 \item {\tt Var(($x$,$i$),_)} is printed as $x$, if $i = 0$ and $x$ does not 

689 end with a digit, as $x$ followed by $i$, if $i \neq 0$ and $x$ does not 

690 end with a digit, and as {\tt $x$.$i$}, if $x$ ends with a digit. Thus the 

691 following cases can arise: 

692 \begin{center} 

693 {\tt\begin{tabular}{cccc} 

694 \verb$Var(("v",0),_)$ & \verb$Var(("v",7),_)$ & \verb$Var(("v5",0),_)$ \\ 

695 "?v" & "?v7" & "?v5.0" 

696 \end{tabular}} 

697 \end{center} 

698 \item {\tt Abs($x@1$,_,Abs($x@2$,_,\dots Abs($x@n$,_,$p$)\dots))}, where $p$ 

699 is not an abstraction, is printed as {\tt \%$y@1\dots y@n$.$P$}, where $P$ 

700 is the result of printing $p$, and the $x@i$ are replaced by $y@i$. The 

701 latter are (possibly new) unique names. 

702 \item {\tt Bound($i$)} is printed as {\tt B.$i$} \footnote{The occurrence of 

703 such ``loose'' bound variables indicates that either you are trying to 

704 print a subterm of an abstraction, or there is something wrong with your 

705 print translations.}. 

706 \item The application {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} (where 

707 $n$ may be $0$!) is printed as follows: 

708 

709 If there is a pair $(c,f)$ in $tab$, print $f[t@1,\dots,t@n]$. If this 

710 application raises exception {\tt Match} or there is no pair $(c,f)$ in 

711 $tab$, let $sy$ be the mixfix annotation associated with $c$. If there is 

712 no such $sy$, or if $sy$ does not have exactly $n$ argument positions, $t$ 

713 is printed as an application; otherwise $t$ is printed according to $sy$. 

714 

715 All other applications are printed as applications. 

716 \end{itemize} 

717 Printing a term {\tt $c$\$$t@1$\$\dots\$$t@n$} as an application means 

718 printing it as {\tt $s$($s@1$,\dots,$s@n$)}, where $s@i$ is the result of 

719 printing $t@i$. If $c$ is a {\tt Const}, $s$ is its first argument; 

720 otherwise $s$ is the result of printing $c$ as described above. 

721 \medskip 

722 

723 The printer also inserts parentheses where they are necessary for reasons 

724 of precedence. 

725 

726 \section{Identifiers, constants, and type inference *} 

727 \label{Typing} 

728 

729 There is one final step in the translation from strings to terms that we have 

730 not covered yet. It explains how constants are distinguished from {\tt Free}s 

731 and how {\tt Free}s and {\tt Var}s are typed. Both issues arise because {\tt 

732 Free}s and {\tt Var}s are not declared. 

733 

734 An identifier $f$ that does not appear as a delimiter in the concrete syntax 

735 can be either a free variable or a constant. Since the parser knows only 

736 about those constants which appear in mixfix annotations, it parses $f$ as 

737 {\tt Free("$f$",dummyT)}, where \ttindex{dummyT} is the predefined dummy {\tt 

738 typ}. Although the parser produces these very raw terms, most user 

739 interface level functions like {\tt goal} type terms according to the given 

740 theory, say $T$. In a first step, every occurrence of {\tt Free($f$,_)} or 

741 {\tt Const($f$,_)} is replaced by {\tt Const($f$,$\tau$)}, provided there is 

742 a constant $f$ of {\tt typ} $\tau$ in $T$. This means that identifiers are 

743 treated as {\tt Free}s iff they are not declared in the theory. The types of 

744 the remaining {\tt Free}s (and {\tt Var}s) are inferred as in \ML. Type 

745 constraints can be used to remove ambiguities. 

746 

747 One peculiarity of the current type inference algorithm is that variables 

748 with the same name must have the same type, irrespective of whether they are 

749 schematic, free or bound. For example, take the firstorder formula $f(x) = x 

750 \land (\forall f.~ f=f)$ where ${=} :: [\alpha{::}term,\alpha]\To o$ and 

751 $\forall :: (\alpha{::}term\To o)\To o$. The first conjunct forces 

752 $x::\alpha{::}term$ and $f::\alpha\To\alpha$, the second one 

753 $f::\beta{::}term$. Although the two $f$'s are distinct, they are required to 

754 have the same type. Unifying $\alpha\To\alpha$ and $\beta{::}term$ fails 

755 because, in firstorder logic, function types are not in class $term$. 

756 

757 

758 \section{Putting it all together} 

759 

760 Having discussed the individual building blocks of a logic definition, it 

761 remains to be shown how they fit together. In particular we need to say how 

762 an objectlogic syntax is hooked up to the metalogic. Since all theorems 

763 must conform to the syntax for $prop$ (see Figure~\ref{MetaLogicSyntax}), 

764 that syntax has to be extended with the objectlevel syntax. Assume that the 

765 syntax of your objectlogic defines a category $o$ of formulae. These 

766 formulae can now appear in axioms and theorems wherever $prop$ does if you 

767 add the production 

768 \[ prop ~=~ form. \] 

769 More precisely, you need a coercion from formulae to propositions: 

770 \begin{ttbox} 

771 Base = Pure + 

772 types o 0 

773 arities o :: logic 

774 consts Trueprop :: "o => prop" ("_" 5) 

775 end 

776 \end{ttbox} 

777 The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible 

778 coercion function. Assuming this definition resides in a file {\tt base.thy}, 

779 you have to load it with the command {\tt use_thy"base"}. 

780 

781 One of the simplest nontrivial logics is {\em minimal logic} of 

782 implication. Its definition in Isabelle needs no advanced features but 

783 illustrates the overall mechanism quite nicely: 

784 \begin{ttbox} 

785 Hilbert = Base + 

786 consts ">" :: "[o,o] => o" (infixr 10) 

787 rules 

788 K "P > Q > P" 

789 S "(P > Q > R) > (P > Q) > P > R" 

790 MP "[ P > Q; P ] ==> Q" 

791 end 

792 \end{ttbox} 

793 After loading this definition you can start to prove theorems in this logic: 

794 \begin{ttbox} 

795 goal Hilbert.thy "P > P"; 

796 {\out Level 0} 

797 {\out P > P} 

798 {\out 1. P > P} 

799 by (resolve_tac [Hilbert.MP] 1); 

800 {\out Level 1} 

801 {\out P > P} 

802 {\out 1. ?P > P > P} 

803 {\out 2. ?P} 

804 by (resolve_tac [Hilbert.MP] 1); 

805 {\out Level 2} 

806 {\out P > P} 

807 {\out 1. ?P1 > ?P > P > P} 

808 {\out 2. ?P1} 

809 {\out 3. ?P} 

810 by (resolve_tac [Hilbert.S] 1); 

811 {\out Level 3} 

812 {\out P > P} 

813 {\out 1. P > ?Q2 > P} 

814 {\out 2. P > ?Q2} 

815 by (resolve_tac [Hilbert.K] 1); 

816 {\out Level 4} 

817 {\out P > P} 

818 {\out 1. P > ?Q2} 

819 by (resolve_tac [Hilbert.K] 1); 

820 {\out Level 5} 

821 {\out P > P} 

822 {\out No subgoals!} 

823 \end{ttbox} 

824 As you can see, this Hilbertstyle formulation of minimal logic is easy to 

825 define but difficult to use. The following natural deduction formulation is 

826 far preferable: 

827 \begin{ttbox} 

828 MinI = Base + 

829 consts ">" :: "[o,o] => o" (infixr 10) 

830 rules 

831 impI "(P ==> Q) ==> P > Q" 

832 impE "[ P > Q; P ] ==> Q" 

833 end 

834 \end{ttbox} 

835 Note, however, that although the two systems are equivalent, this fact cannot 

836 be proved within Isabelle: {\tt S} and {\tt K} can be derived in \verb$MinI$ 

837 (exercise!), but {\tt impI} cannot be derived in \verb!Hilbert!. The reason 

838 is that {\tt impI} is only an {\em admissible} rule in \verb!Hilbert!, 

839 something that can only be shown by induction over all possible proofs in 

840 \verb!Hilbert!. 

841 

842 It is a very simple matter to extend minimal logic with falsity: 

843 \begin{ttbox} 

844 MinIF = MinI + 

845 consts False :: "o" 

846 rules 

847 FalseE "False ==> P" 

848 end 

849 \end{ttbox} 

850 On the other hand, we may wish to introduce conjunction only: 

851 \begin{ttbox} 

852 MinC = Base + 

853 consts "&" :: "[o,o] => o" (infixr 30) 

854 rules 

855 conjI "[ P; Q ] ==> P & Q" 

856 conjE1 "P & Q ==> P" 

857 conjE2 "P & Q ==> Q" 

858 end 

859 \end{ttbox} 

860 And if we want to have all three connectives together, we define: 

861 \begin{ttbox} 

862 MinIFC = MinIF + MinC 

863 \end{ttbox} 

864 Now we can prove mixed theorems like 

865 \begin{ttbox} 

866 goal MinIFC.thy "P & False > Q"; 

867 by (resolve_tac [MinI.impI] 1); 

868 by (dresolve_tac [MinC.conjE2] 1); 

869 by (eresolve_tac [MinIF.FalseE] 1); 

870 \end{ttbox} 

871 Try this as an exercise! 