summary |
shortlog |
changelog |
graph |
tags |
branches |
files |
changeset |
file |
revisions |
annotate |
diff |
raw

doc-src/Logics/old.defining.tex

author | paulson |

Fri Feb 16 18:00:47 1996 +0100 (1996-02-16) | |

changeset 1512 | ce37c64244c0 |

parent 104 | d8205bb279a7 |

permissions | -rw-r--r-- |

Elimination of fully-functorial style.

Type tactic changed to a type abbrevation (from a datatype).

Constructor tactic and function apply deleted.

Type tactic changed to a type abbrevation (from a datatype).

Constructor tactic and function apply deleted.

1 \chapter{Defining Logics} \label{Defining-Logics}

2 This chapter is intended for Isabelle experts. It explains how to define new

3 logical systems, Isabelle's {\it raison d'\^etre}. Isabelle logics are

4 hierarchies of theories. A number of simple examples are contained in the

5 introductory manual; the full syntax of theory definitions is shown in the

6 {\em Reference Manual}. The purpose of this chapter is to explain the

7 remaining subtleties, especially some context conditions on the class

8 structure and the definition of new mixfix syntax. A full understanding of

9 the material requires knowledge of the internal representation of terms (data

10 type {\tt term}) as detailed in the {\em Reference Manual}. Sections marked

11 with a * can be skipped on first reading.

14 \section{Classes and Types *}

15 \index{*arities!context conditions}

17 Type declarations are subject to the following two well-formedness

18 conditions:

19 \begin{itemize}

20 \item There are no two declarations $ty :: (\vec{r})c$ and $ty :: (\vec{s})c$

21 with $\vec{r} \neq \vec{s}$. For example

22 \begin{ttbox}

23 types ty 1

24 arities ty :: (\{logic\}) logic

25 ty :: (\{\})logic

26 \end{ttbox}

27 leads to an error message and fails.

28 \item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty ::

29 (s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold

30 for $i=1,\dots,n$. The relationship $\preceq$, defined as

31 \[ s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c, \]

32 expresses that the set of types represented by $s'$ is a subset of the set of

33 types represented by $s$. For example

34 \begin{ttbox}

35 classes term < logic

36 types ty 1

37 arities ty :: (\{logic\})logic

38 ty :: (\{\})term

39 \end{ttbox}

40 leads to an error message and fails.

41 \end{itemize}

42 These conditions guarantee principal types~\cite{nipkow-prehofer}.

44 \section{Precedence Grammars}

45 \label{PrecedenceGrammars}

46 \index{precedence grammar|(}

48 The precise syntax of a logic is best defined by a context-free grammar.

49 These grammars obey the following conventions: identifiers denote

50 nonterminals, {\tt typewriter} fount denotes terminals, repetition is

51 indicated by \dots, and alternatives are separated by $|$.

53 In order to simplify the description of mathematical languages, we introduce

54 an extended format which permits {\bf precedences}\index{precedence}. This

55 scheme generalizes precedence declarations in \ML\ and {\sc prolog}. In this

56 extended grammar format, nonterminals are decorated by integers, their

57 precedence. In the sequel, precedences are shown as subscripts. A nonterminal

58 $A@p$ on the right-hand side of a production may only be replaced using a

59 production $A@q = \gamma$ where $p \le q$.

61 Formally, a set of context free productions $G$ induces a derivation

62 relation $\rew@G$ on strings as follows:

63 \[ \alpha A@p \beta ~\rew@G~ \alpha\gamma\beta ~~~iff~~~

64 \exists q \ge p.~(A@q=\gamma) \in G

65 \]

66 Any extended grammar of this kind can be translated into a normal context

67 free grammar. However, this translation may require the introduction of a

68 large number of new nonterminals and productions.

70 \begin{example}

71 \label{PrecedenceEx}

72 The following simple grammar for arithmetic expressions demonstrates how

73 binding power and associativity of operators can be enforced by precedences.

74 \begin{center}

75 \begin{tabular}{rclr}

76 $A@9$ & = & {\tt0} \\

77 $A@9$ & = & {\tt(} $A@0$ {\tt)} \\

78 $A@0$ & = & $A@0$ {\tt+} $A@1$ \\

79 $A@2$ & = & $A@3$ {\tt*} $A@2$ \\

80 $A@3$ & = & {\tt-} $A@3$

81 \end{tabular}

82 \end{center}

83 The choice of precedences determines that \verb$-$ binds tighter than

84 \verb$*$ which binds tighter than \verb$+$, and that \verb$+$ and \verb$*$

85 associate to the left and right, respectively.

86 \end{example}

88 To minimize the number of subscripts, we adopt the following conventions:

89 \begin{itemize}

90 \item all precedences $p$ must be in the range $0 \leq p \leq max_pri$ for

91 some fixed $max_pri$.

92 \item precedence $0$ on the right-hand side and precedence $max_pri$ on the

93 left-hand side may be omitted.

94 \end{itemize}

95 In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$.

97 Using these conventions and assuming $max_pri=9$, the grammar in

98 Example~\ref{PrecedenceEx} becomes

99 \begin{center}

100 \begin{tabular}{rclc}

101 $A$ & = & {\tt0} & \hspace*{4em} \\

102 & $|$ & {\tt(} $A$ {\tt)} \\

103 & $|$ & $A$ {\tt+} $A@1$ & (0) \\

104 & $|$ & $A@3$ {\tt*} $A@2$ & (2) \\

105 & $|$ & {\tt-} $A@3$ & (3)

106 \end{tabular}

107 \end{center}

109 \index{precedence grammar|)}

111 \section{Basic syntax *}

113 An informal account of most of Isabelle's syntax (meta-logic, types etc) is

114 contained in {\em Introduction to Isabelle}. A precise description using a

115 precedence grammar is shown in Figure~\ref{MetaLogicSyntax}. This description

116 is the basis of all extensions by object-logics.

117 \begin{figure}[htb]

118 \begin{center}

119 \begin{tabular}{rclc}

120 $prop$ &=& \ttindex{PROP} $aprop$ ~~$|$~~ {\tt(} $prop$ {\tt)} \\

121 &$|$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\

122 &$|$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\

123 &$|$& {\tt[|} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt|]} {\tt==>} $prop@1$ & (1) \\

124 &$|$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\

125 $logic$ &=& $prop$ ~~$|$~~ $fun$ \\\\

126 $aprop$ &=& $id$ ~~$|$~~ $var$

127 ~~$|$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\

128 $fun$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $fun$ {\tt)} \\

129 &$|$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\

130 $idts$ &=& $idt$ ~~$|$~~ $idt@1$ $idts$ \\\\

131 $idt$ &=& $id$ ~~$|$~~ {\tt(} $idt$ {\tt)} \\

132 &$|$& $id$ \ttindex{::} $type$ & (0) \\\\

133 $type$ &=& $tfree$ ~~$|$~~ $tvar$ \\

134 &$|$& $tfree$ {\tt::} $sort$ ~~$|$~~ $tvar$ {\tt::} $sort$ \\

135 &$|$& $id$ ~~$|$~~ $type@{max_pri}$ $id$

136 ~~$|$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\

137 &$|$& $type@1$ \ttindex{=>} $type$ & (0) \\

138 &$|$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\

139 &$|$& {\tt(} $type$ {\tt)} \\\\

140 $sort$ &=& $id$ ~~$|$~~ {\tt\{\}}

141 ~~$|$~~ {\tt\{} $id$ {\tt,} \dots {\tt,} $id$ {\tt\}}

142 \end{tabular}\index{*"!"!}\index{*"["|}\index{*"|"]}

143 \indexbold{type@$type$}\indexbold{sort@$sort$}\indexbold{idts@$idts$}

144 \indexbold{logic@$logic$}\indexbold{prop@$prop$}\indexbold{fun@$fun$}

145 \end{center}

146 \caption{Meta-Logic Syntax}

147 \label{MetaLogicSyntax}

148 \end{figure}

149 The following main categories are defined:

150 \begin{description}

151 \item[$prop$] Terms of type $prop$, i.e.\ formulae of the meta-logic.

152 \item[$aprop$] Atomic propositions.

153 \item[$logic$] Terms of types in class $logic$. Initially, $logic$ contains

154 merely $prop$. As the syntax is extended by new object-logics, more

155 productions for $logic$ are added (see below).

156 \item[$fun$] Terms potentially of function type.

157 \item[$type$] Types.

158 \item[$idts$] a list of identifiers, possibly constrained by types. Note

159 that $x::nat~y$ is parsed as $x::(nat~y)$, i.e.\ $y$ is treated like a

160 type constructor applied to $nat$.

161 \end{description}

163 The predefined types $id$, $var$, $tfree$ and $tvar$ represent identifiers

164 ({\tt f}), unknowns ({\tt ?f}), type variables ({\tt 'a}), and type unknowns

165 ({\tt ?'a}) respectively. If we think of them as nonterminals with

166 predefined syntax, we may assume that all their productions have precedence

167 $max_pri$.

169 \subsection{Logical types and default syntax}

171 Isabelle is concerned with mathematical languages which have a certain

172 minimal vocabulary: identifiers, variables, parentheses, and the lambda

173 calculus. Logical types, i.e.\ those of class $logic$, are automatically

174 equipped with this basic syntax. More precisely, for any type constructor

175 $ty$ with arity $(\dots)c$, where $c$ is a subclass of $logic$, the following

176 productions are added:

177 \begin{center}

178 \begin{tabular}{rclc}

179 $ty$ &=& $id$ ~~$|$~~ $var$ ~~$|$~~ {\tt(} $ty$ {\tt)} \\

180 &$|$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\

181 &$|$& $ty@{max_pri}$ {\tt::} $type$\\\\

182 $logic$ &=& $ty$

183 \end{tabular}

184 \end{center}

187 \section{Mixfix syntax}

188 \index{mixfix|(}

190 We distinguish between abstract and concrete syntax. The {\em abstract}

191 syntax is given by the typed constants of a theory. Abstract syntax trees are

192 well-typed terms, i.e.\ values of \ML\ type {\tt term}. If none of the

193 constants are introduced with mixfix annotations, there is no concrete syntax

194 to speak of: terms can only be abstractions or applications of the form

195 $f(t@1,\dots,t@n)$, where $f$ is a constant or variable. Since this notation

196 quickly becomes unreadable, Isabelle supports syntax definitions in the form

197 of unrestricted context-free grammars using mixfix annotations.

199 Mixfix annotations describe the {\em concrete} syntax, its translation into

200 the abstract syntax, and a pretty-printing scheme, all in one. Isabelle

201 syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em mixfix\/} syntax.

202 Each mixfix annotation defines a precedence grammar production and associates

203 an Isabelle constant with it.

205 A {\em mixfix declaration} {\tt consts $c$ ::\ $\tau$ ($sy$ $ps$ $p$)} is

206 interpreted as a grammar pro\-duction as follows:

207 \begin{itemize}

208 \item $sy$ is the right-hand side of this production, specified as a {\em

209 mixfix annotation}. In general, $sy$ is of the form

210 $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$, where each occurrence of

211 ``\ttindex{_}'' denotes an argument/nonterminal and the strings

212 $\alpha@i$ do not contain ``{\tt_}''.

213 \item $\tau$ specifies the types of the nonterminals on the left and right

214 hand side. If $sy$ is of the form above, $\tau$ must be of the form

215 $[\tau@1,\dots,\tau@n] \To \tau'$. Then argument $i$ is of type $\tau@i$

216 and the result, i.e.\ the left-hand side of the production, is of type

217 $\tau'$. Both the $\tau@i$ and $\tau'$ may be function types.

218 \item $c$ is the name of the Isabelle constant associated with this production.

219 Parsing an instance of the phrase $sy$ generates the {\tt term} {\tt

220 Const($c$,dummyT\footnote{Proper types are inserted later on. See

221 \S\ref{Typing}})\$$a@1$\$$\dots$\$$a@n$}\index{*dummyT}, where $a@i$ is

222 the term generated by parsing the $i^{th}$ argument.

223 \item $ps$ must be of the form $[p@1,\dots,p@n]$, where $p@i$ is the

224 minimal precedence\index{precedence} required of any phrase that may appear

225 as the $i^{th}$ argument. The null list is interpreted as a list of 0's of

226 the appropriate length.

227 \item $p$ is the precedence of this production.

228 \end{itemize}

229 Notice that there is a close connection between abstract and concrete syntax:

230 each production has an associated constant, and types act as {\bf syntactic

231 categories} in the concrete syntax. To emphasize this connection, we

232 sometimes refer to the nonterminals on the right-hand side of a production as

233 its arguments and to the nonterminal on the left-hand side as its result.

235 The maximal legal precedence is called \ttindexbold{max_pri}, which is

236 currently 1000. If you want to ignore precedences, the safest way to do so is

237 to use the annotation {\tt($sy$)}: this production puts no precedence

238 constraints on any of its arguments and has maximal precedence itself, i.e.\

239 it is always applicable and does not exclude any productions of its

240 arguments.

242 \begin{example}

243 In mixfix notation the grammar in Example~\ref{PrecedenceEx} can be written

244 as follows:

245 \begin{ttbox}

246 types exp 0

247 consts "0" :: "exp" ("0" 9)

248 "+" :: "[exp,exp] => exp" ("_ + _" [0,1] 0)

249 "*" :: "[exp,exp] => exp" ("_ * _" [3,2] 2)

250 "-" :: "exp => exp" ("- _" [3] 3)

251 \end{ttbox}

252 Parsing the string \verb!"0 + - 0 + 0"! produces the term {\tt

253 $p$\$($p$\$($m$\$$z$)\$$z$)\$$z$} where {\tt$p =$ Const("+",dummyT)},

254 {\tt$m =$ Const("-",dummyT)}, and {\tt$z =$ Const("0",dummyT)}.

255 \end{example}

257 The interpretation of \ttindex{_} in a mixfix annotation is always as a {\bf

258 meta-character}\index{meta-character} which does not represent itself but

259 an argument position. The following characters are also meta-characters:

260 \begin{ttbox}

261 ' ( ) /

262 \end{ttbox}

263 Preceding any character with a quote (\verb$'$) turns it into an ordinary

264 character. Thus you can write \verb!''! if you really want a single quote.

265 The purpose of the other meta-characters is explained in

266 \S\ref{PrettyPrinting}. Remember that in \ML\ strings \verb$\$ is already a

267 (different kind of) meta-character.

270 \subsection{Types and syntactic categories *}

272 The precise mapping from types to syntactic categories is defined by the

273 following function:

274 \begin{eqnarray*}

275 N(\tau@1\To\tau@2) &=& fun \\

276 N((\tau@1,\dots,\tau@n)ty) &=& ty \\

277 N(\alpha) &=& logic

278 \end{eqnarray*}

279 Only the outermost type constructor is taken into account and type variables

280 can range over all logical types. This catches some ill-typed terms (like

281 $Cons(x,0)$, where $Cons :: [\alpha,\alpha list] \To \alpha list$ and $0 ::

282 nat$) but leaves the real work to the type checker.

284 In terms of the precedence grammar format introduced in

285 \S\ref{PrecedenceGrammars}, the declaration

286 \begin{ttbox}

287 consts \(c\) :: "[\(\tau@1\),\dots,\(\tau@n\)]\(\To\tau\)" ("\(\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n\)") [\(p@1\),\dots,\(p@n\)] \(p\))

288 \end{ttbox}

289 defines the production

290 \[ N(\tau)@p ~~=~~ \alpha@0 ~N(\tau@1)@{p@1}~ \alpha@1~ \dots

291 ~\alpha@{n-1} ~N(\tau@n)@{p@n}~ \alpha@n

292 \]

294 \subsection{Copy productions *}

296 Productions which do not create a new node in the abstract syntax tree are

297 called {\bf copy productions}. They must have exactly one nonterminal on

298 the right hand side. The term generated when parsing that nonterminal is

299 simply passed up as the result of parsing the whole copy production. In

300 Isabelle a copy production is indicated by an empty constant name, i.e.\ by

301 \begin{ttbox}

302 consts "" :: \(\tau\) (\(sy\) \(ps\) \(p\))

303 \end{ttbox}

305 A special kind of copy production is one where, modulo white space, $sy$ is

306 {\tt"_"}. It is called a {\bf chain production}. Chain productions should be

307 seen as an abbreviation mechanism. Conceptually, they are removed from the

308 grammar by adding appropriate new rules. Precedence information attached to

309 chain productions is ignored. The following example demonstrates the effect:

310 the grammar defined by

311 \begin{ttbox}

312 types A,B,C 0

313 consts AB :: "B => A" ("A _" [10] 517)

314 "" :: "C => B" ("_" [0] 100)

315 x :: "C" ("x" 5)

316 y :: "C" ("y" 15)

317 \end{ttbox}

318 admits {\tt"A y"} but not {\tt"A x"}. Had the constant in the second

319 production been some non-empty string, both {\tt"A y"} and {\tt"A x"} would

320 be legal.

322 \index{mixfix|)}

324 \section{Lexical conventions}

326 The lexical analyzer distinguishes the following kinds of tokens: delimiters,

327 identifiers, unknowns, type variables and type unknowns.

329 Delimiters are user-defined, i.e.\ they are extracted from the syntax

330 definition. If $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$ is a mixfix

331 annotation, each $\alpha@i$ is decomposed into substrings

332 $\beta@1~\dots~\beta@k$ which are separated by and do not contain

333 \bfindex{white space} ( = blanks, tabs, newlines). Each $\beta@j$ becomes a

334 delimiter. Thus a delimiter can be an arbitrary string not containing white

335 space.

337 The lexical syntax of identifiers and variables ( = unknowns) is defined in

338 the introductory manual. Parsing an identifier $f$ generates {\tt

339 Free($f$,dummyT)}\index{*dummyT}. Parsing a variable {\tt?}$v$ generates

340 {\tt Var(($u$,$i$),dummyT)} where $i$ is the integer value of the longest

341 numeric suffix of $v$ (possibly $0$), and $u$ is the remaining prefix.

342 Parsing a variable {\tt?}$v{.}i$ generates {\tt Var(($v$,$i$),dummyT)}. The

343 following table covers the four different cases that can arise:

344 \begin{center}\tt

345 \begin{tabular}{cccc}

346 "?v" & "?v.7" & "?v5" & "?v7.5" \\

347 Var(("v",0),$d$) & Var(("v",7),$d$) & Var(("v",5),$d$) & Var(("v7",5),$d$)

348 \end{tabular}

349 \end{center}

350 where $d = {\tt dummyT}$.

352 In mixfix annotations, \ttindexbold{id}, \ttindexbold{var},

353 \ttindexbold{tfree} and \ttindexbold{tvar} are the predefined categories of

354 identifiers, unknowns, type variables and type unknowns, respectively.

357 The lexical analyzer translates input strings to token lists by repeatedly

358 taking the maximal prefix of the input string that forms a valid token. A

359 maximal prefix that is both a delimiter and an identifier or variable (like

360 {\tt ALL}) is treated as a delimiter. White spaces are separators.

362 An important consequence of this translation scheme is that delimiters need

363 not be separated by white space to be recognized as separate. If \verb$"-"$

364 is a delimiter but \verb$"--"$ is not, the string \verb$"--"$ is treated as

365 two consecutive occurrences of \verb$"-"$. This is in contrast to \ML\ which

366 would treat \verb$"--"$ as a single (undeclared) identifier. The

367 consequence of Isabelle's more liberal scheme is that the same string may be

368 parsed in a different way after extending the syntax: after adding

369 \verb$"--"$ as a delimiter, the input \verb$"--"$ is treated as

370 a single occurrence of \verb$"--"$.

372 \section{Infix operators}

374 {\tt Infixl} and {\tt infixr} declare infix operators which associate to the

375 left and right respectively. As in \ML, prefixing infix operators with

376 \ttindexbold{op} turns them into curried functions. Infix declarations can

377 be reduced to mixfix ones as follows:

378 \begin{center}\tt

379 \begin{tabular}{l@{~~$\Longrightarrow$~~}l}

380 "$c$" ::~$\tau$ (\ttindexbold{infixl} $p$) &

381 "op $c$" ::~$\tau$ ("_ $c$ _" [$p$,$p+1$] $p$) \\

382 "$c$" ::~$\tau$ (\ttindexbold{infixr} $p$) &

383 "op $c$" ::~$\tau$ ("_ $c$ _" [$p+1$,$p$] $p$)

384 \end{tabular}

385 \end{center}

388 \section{Binders}

389 A {\bf binder} is a variable-binding constant, such as a quantifier.

390 The declaration

391 \begin{ttbox}

392 consts \(c\) :: \(\tau\) (binder \(Q\) \(p\))

393 \end{ttbox}\indexbold{*binder}

394 introduces a binder $c$ of type $\tau$,

395 which must have the form $(\tau@1\To\tau@2)\To\tau@3$. Its concrete syntax

396 is $Q~x.t$. A binder is like a generalized quantifier where $\tau@1$ is the

397 type of the bound variable $x$, $\tau@2$ the type of the body $t$, and

398 $\tau@3$ the type of the whole term. For example $\forall$ can be declared

399 like this:

400 \begin{ttbox}

401 consts All :: "('a => o) => o" (binder "ALL " 10)

402 \end{ttbox}

403 This allows us to write $\forall x.P$ either as {\tt ALL $x$.$P$} or {\tt

404 All(\%$x$.$P$)}; the latter form is for purists only.

406 In case $\tau@2 = \tau@3$, nested quantifications can be written as $Q~x@1

407 \dots x@n.t$. From a syntactic point of view,

408 \begin{ttbox}

409 consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)" (binder "\(Q\)" \(p\))

410 \end{ttbox}

411 is equivalent to

412 \begin{ttbox}

413 consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)"

414 "\(Q\)" :: "[idts,\(\tau@2\)] => \(\tau@3\)" ("\(Q\)_. _" \(p\))

415 \end{ttbox}

416 where {\tt idts} is the syntactic category $idts$ defined in

417 Figure~\ref{MetaLogicSyntax}.

419 However, there is more to binders than concrete syntax: behind the scenes the

420 body of the quantified expression has to be converted into a

421 $\lambda$-abstraction (when parsing) and back again (when printing). This

422 is performed by the translation mechanism, which is discussed below. For

423 binders, the definition of the required translation functions has been

424 automated. Many other syntactic forms, such as set comprehension, require

425 special treatment.

428 \section{Parse translations *}

429 \label{Parse-translations}

430 \index{parse translation|(}

432 So far we have pretended that there is a close enough relationship between

433 concrete and abstract syntax to allow an automatic translation from one to

434 the other using the constant name supplied with each production. In many

435 cases this scheme is not powerful enough, especially for constructs involving

436 variable bindings. Therefore the $ML$-section of a theory definition can

437 associate constant names with user-defined translation functions by including

438 a line

439 \begin{ttbox}

440 val parse_translation = \dots

441 \end{ttbox}

442 where the right-hand side of this binding must be an \ML-expression of type

443 \verb$(string * (term list -> term))list$.

445 After the input string has been translated into a term according to the

446 syntax definition, there is a second phase in which the term is translated

447 using the user-supplied functions in a bottom-up manner. Given a list $tab$

448 of the above type, a term $t$ is translated as follows. If $t$ is not of the

449 form {\tt Const($c$,$\tau$)\$$t@1$\$\dots\$$t@n$}, then $t$ is returned

450 unchanged. Otherwise all $t@i$ are translated into $t@i'$. Let {\tt $t' =$

451 Const($c$,$\tau$)\$$t@1'$\$\dots\$$t@n'$}. If there is no pair $(c,f)$ in

452 $tab$, return $t'$. Otherwise apply $f$ to $[t@1',\dots,t@n']$. If that

453 raises an exception, return $t'$, otherwise return the result.

454 \begin{example}\label{list-enum}

455 \ML-lists are constructed by {\tt[]} and {\tt::}. For readability the

456 list \hbox{\tt$x$::$y$::$z$::[]} can be written \hbox{\tt[$x$,$y$,$z$]}.

457 In Isabelle the two forms of lists are declared as follows:

458 \begin{ttbox}

459 types list 1

460 enum 0

461 arities list :: (term)term

462 consts "[]" :: "'a list" ("[]")

463 ":" :: "['a, 'a list] => 'a list" (infixr 50)

464 enum :: "enum => 'a list" ("[_]")

465 sing :: "'a => enum" ("_")

466 cons :: "['a,enum] => enum" ("_, _")

467 end

468 \end{ttbox}

469 Because \verb$::$ is already used for type constraints, it is replaced by

470 \verb$:$ as the infix list constructor.

472 In order to allow list enumeration, the new type {\tt enum} is introduced.

473 Its only purpose is syntactic and hence it does not need an arity, in

474 contrast to the logical type {\tt list}. Although \hbox{\tt[$x$,$y$,$z$]} is

475 syntactically legal, it needs to be translated into a term built up from

476 \verb$[]$ and \verb$:$. This is what \verb$make_list$ accomplishes:

477 \begin{ttbox}

478 val cons = Const("op :", dummyT);

480 fun make_list (Const("sing",_)$e) = cons $ e $ Const("[]", dummyT)

481 | make_list (Const("cons",_)$e$es) = cons $ e $ make_list es;

482 \end{ttbox}

483 To hook this translation up to Isabelle's parser, the theory definition needs

484 to contain the following $ML$-section:

485 \begin{ttbox}

486 ML

487 fun enum_tr[enum] = make_list enum;

488 val parse_translation = [("enum",enum_tr)]

489 \end{ttbox}

490 This causes \verb!Const("enum",_)$!$t$ to be replaced by

491 \verb$enum_tr[$$t$\verb$]$.

493 Of course the definition of \verb$make_list$ should be included in the

494 $ML$-section.

495 \end{example}

496 \begin{example}\label{SET}

497 Isabelle represents the set $\{ x \mid P(x) \}$ internally by $Set(\lambda

498 x.P(x))$. The internal and external forms need separate

499 constants:\footnote{In practice, the external form typically has a name

500 beginning with an {\at} sign, such as {\tt {\at}SET}. This emphasizes that

501 the constant should be used only for parsing/printing.}

502 \begin{ttbox}

503 types set 1

504 arities set :: (term)term

505 consts Set :: "('a => o) => 'a set"

506 SET :: "[id,o] => 'a set" ("\{_ | _\}")

507 \end{ttbox}

508 Parsing {\tt"\{$x$ | $P$\}"} according to this syntax yields the term {\tt

509 Const("SET",dummyT) \$ Free("\(x\)",dummyT) \$ \(p\)}, where $p$ is the

510 result of parsing $P$. What we need is the term {\tt

511 Const("Set",dummyT)\$Abs("$x$",dummyT,$p'$)}, where $p'$ is some

512 ``abstracted'' version of $p$. Therefore we define a function

513 \begin{ttbox}

514 fun set_tr[Free(s,T), p] = Const("Set", dummyT) $

515 Abs(s, T, abstract_over(Free(s,T), p));

516 \end{ttbox}

517 where \verb$abstract_over: term*term -> term$ is a predefined function such

518 that {\tt abstract_over($u$,$t$)} replaces every occurrence of $u$ in $t$ by

519 a {\tt Bound} variable of the correct index (i.e.\ 0 at top level). Remember

520 that {\tt dummyT} is replaced by the correct types at a later stage (see

521 \S\ref{Typing}). Function {\tt set_tr} is associated with {\tt SET} by

522 including the \ML-text

523 \begin{ttbox}

524 val parse_translation = [("SET", set_tr)];

525 \end{ttbox}

526 \end{example}

528 If you want to run the above examples in Isabelle, you should note that an

529 $ML$-section needs to contain not just a definition of

530 \verb$parse_translation$ but also of a variable \verb$print_translation$. The

531 purpose of the latter is to reverse the effect of the former during printing;

532 details are found in \S\ref{Print-translations}. Hence you need to include

533 the line

534 \begin{ttbox}

535 val print_translation = [];

536 \end{ttbox}

537 This is instructive because the terms are then printed out in their internal

538 form. For example the input \hbox{\tt[$x$,$y$,$z$]} is echoed as

539 \hbox{\tt$x$:$y$:$z$:[]}. This helps to check that your parse translation is

540 working correctly.

542 %\begin{warn}

543 %Explicit type constraints disappear with type checking but are still

544 %visible to the parse translation functions.

545 %\end{warn}

547 \index{parse translation|)}

549 \section{Printing}

551 Syntax definitions provide printing information in three distinct ways:

552 through

553 \begin{itemize}

554 \item the syntax of the language (as used for parsing),

555 \item pretty printing information, and

556 \item print translation functions.

557 \end{itemize}

558 The bare mixfix declarations enable Isabelle to print terms, but the result

559 will not necessarily be pretty and may look different from what you expected.

560 To produce a pleasing layout, you need to read the following sections.

562 \subsection{Printing with mixfix declarations}

564 Let {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} be a term and let

565 \begin{ttbox}

566 consts \(c\) :: \(\tau\) (\(sy\))

567 \end{ttbox}

568 be a mixfix declaration where $sy$ is of the form

569 $\alpha@0\_\alpha@1\dots\alpha@{n-1}\_\alpha@n$. Printing $t$ according to

570 $sy$ means printing the string

571 $\alpha@0\beta@1\alpha@1\ldots\alpha@{n-1}\beta@n\alpha@n$, where $\beta@i$

572 is the result of printing $t@i$.

574 Note that the system does {\em not\/} insert blanks. They should be part of

575 the mixfix syntax if they are required to separate tokens or achieve a

576 certain layout.

578 \subsection{Pretty printing}

579 \label{PrettyPrinting}

580 \index{pretty printing}

582 In order to format the output, it is possible to embed pretty printing

583 directives in mixfix annotations. These directives are ignored during parsing

584 and affect only printing. The characters {\tt(}, {\tt)} and {\tt/} are

585 interpreted as meta-characters\index{meta-character} when found in a mixfix

586 annotation. Their meaning is

587 \begin{description}

588 \item[~{\tt(}~] Open a block. A sequence of digits following it is

589 interpreted as the \bfindex{indentation} of this block. It causes the

590 output to be indented by $n$ positions if a line break occurs within the

591 block. If {\tt(} is not followed by a digit, the indentation defaults to

592 $0$.

593 \item[~{\tt)}~] Close a block.

594 \item[~\ttindex{/}~] Allow a \bfindex{line break}. White space immediately

595 following {\tt/} is not printed if the line is broken at this point.

596 \end{description}

598 \subsection{Print translations *}

599 \label{Print-translations}

600 \index{print translation|(}

602 Since terms are translated after parsing (see \S\ref{Parse-translations}),

603 there is a similar mechanism to translate them back before printing.

604 Therefore the $ML$-section of a theory definition can associate constant

605 names with user-defined translation functions by including a line

606 \begin{ttbox}

607 val print_translation = \dots

608 \end{ttbox}

609 where the right-hand side of this binding is again an \ML-expression of type

610 \verb$(string * (term list -> term))list$.

611 Including a pair $(c,f)$ in this list causes the printer to print

612 $f[t@1,\dots,t@n]$ whenever it finds {\tt Const($c$,_)\$$t@1$\$\dots\$$t@n$}.

613 \begin{example}

614 Reversing the effect of the parse translation in Example~\ref{list-enum} is

615 accomplished by the following function:

616 \begin{ttbox}

617 fun make_enum (Const("op :",_) $ e $ es) = case es of

618 Const("[]",_) => Const("sing",dummyT) $ e

619 | _ => Const("enum",dummyT) $ e $ make_enum es;

620 \end{ttbox}

621 It translates \hbox{\tt$x$:$y$:$z$:[]} to \hbox{\tt[$x$,$y$,$z$]}. However,

622 if the input does not terminate with an empty list, e.g.\ \hbox{\tt$x$:$xs$},

623 \verb$make_enum$ raises exception {\tt Match}. This signals that the

624 attempted translation has failed and the term should be printed as is.

625 The connection with Isabelle's pretty printer is established as follows:

626 \begin{ttbox}

627 fun enum_tr'[x,xs] = Const("enum",dummyT) $

628 make_enum(Const("op :",dummyT)$x$xs);

629 val print_translation = [("op :", enum_tr')];

630 \end{ttbox}

631 This declaration causes the printer to print \hbox{\tt enum_tr'[$x$,$y$]}

632 whenever it finds \verb!Const("op :",_)$!$x$\verb!$!$y$.

633 \end{example}

634 \begin{example}

635 In Example~\ref{SET} we showed how to translate the concrete syntax for set

636 comprehension into the proper internal form. The string {\tt"\{$x$ |

637 $P$\}"} now becomes {\tt Const("Set",_)\$Abs("$x$",_,$p$)}. If, however,

638 the latter term were printed without translating it back, it would result

639 in {\tt"Set(\%$x$.$P$)"}. Therefore the abstraction has to be turned back

640 into a term that matches the concrete mixfix syntax:

641 \begin{ttbox}

642 fun set_tr'[Abs(x,T,P)] =

643 let val (x',P') = variant_abs(x,T,P)

644 in Const("SET",dummyT) $ Free(x',T) $ P' end;

645 \end{ttbox}

646 The function \verb$variant_abs$, a basic term manipulation function, replaces

647 the bound variable $x$ by a {\tt Free} variable $x'$ having a unique name. A

648 term produced by {\tt set_tr'} can now be printed according to the concrete

649 syntax defined in Example~\ref{SET} above.

651 Notice that the application of {\tt set_tr'} fails if the second component of

652 the argument is not an abstraction, but for example just a {\tt Free}

653 variable. This is intentional because it signals to the caller that the

654 translation is inapplicable. As a result {\tt Const("Set",_)\$Free("P",_)}

655 prints as {\tt"Set(P)"}.

657 The full theory extension, including concrete syntax and both translation

658 functions, has the following form:

659 \begin{ttbox}

660 types set 1

661 arities set :: (term)term

662 consts Set :: "('a => o) => 'a set"

663 SET :: "[id,o] => 'a set" ("\{_ | _\}")

664 end

665 ML

666 fun set_tr[Free(s,T), p] = \dots;

667 val parse_translation = [("SET", set_tr)];

668 fun set_tr'[Abs(x,T,P)] = \dots;

669 val print_translation = [("Set", set_tr')];

670 \end{ttbox}

671 \end{example}

672 As during the parse translation process, the types attached to constants

673 during print translation are ignored. Thus {\tt Const("SET",dummyT)} in

674 {\tt set_tr'} above is acceptable. The types of {\tt Free}s and {\tt Var}s

675 however must be preserved because they may get printed (see {\tt

676 show_types}).

678 \index{print translation|)}

680 \subsection{Printing a term}

682 Let $tab$ be the set of all string-function pairs of print translations in the

683 current syntax.

685 Terms are printed recursively; print translations are applied top down:

686 \begin{itemize}

687 \item {\tt Free($x$,_)} is printed as $x$.

688 \item {\tt Var(($x$,$i$),_)} is printed as $x$, if $i = 0$ and $x$ does not

689 end with a digit, as $x$ followed by $i$, if $i \neq 0$ and $x$ does not

690 end with a digit, and as {\tt $x$.$i$}, if $x$ ends with a digit. Thus the

691 following cases can arise:

692 \begin{center}

693 {\tt\begin{tabular}{cccc}

694 \verb$Var(("v",0),_)$ & \verb$Var(("v",7),_)$ & \verb$Var(("v5",0),_)$ \\

695 "?v" & "?v7" & "?v5.0"

696 \end{tabular}}

697 \end{center}

698 \item {\tt Abs($x@1$,_,Abs($x@2$,_,\dots Abs($x@n$,_,$p$)\dots))}, where $p$

699 is not an abstraction, is printed as {\tt \%$y@1\dots y@n$.$P$}, where $P$

700 is the result of printing $p$, and the $x@i$ are replaced by $y@i$. The

701 latter are (possibly new) unique names.

702 \item {\tt Bound($i$)} is printed as {\tt B.$i$} \footnote{The occurrence of

703 such ``loose'' bound variables indicates that either you are trying to

704 print a subterm of an abstraction, or there is something wrong with your

705 print translations.}.

706 \item The application {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} (where

707 $n$ may be $0$!) is printed as follows:

709 If there is a pair $(c,f)$ in $tab$, print $f[t@1,\dots,t@n]$. If this

710 application raises exception {\tt Match} or there is no pair $(c,f)$ in

711 $tab$, let $sy$ be the mixfix annotation associated with $c$. If there is

712 no such $sy$, or if $sy$ does not have exactly $n$ argument positions, $t$

713 is printed as an application; otherwise $t$ is printed according to $sy$.

715 All other applications are printed as applications.

716 \end{itemize}

717 Printing a term {\tt $c$\$$t@1$\$\dots\$$t@n$} as an application means

718 printing it as {\tt $s$($s@1$,\dots,$s@n$)}, where $s@i$ is the result of

719 printing $t@i$. If $c$ is a {\tt Const}, $s$ is its first argument;

720 otherwise $s$ is the result of printing $c$ as described above.

721 \medskip

723 The printer also inserts parentheses where they are necessary for reasons

724 of precedence.

726 \section{Identifiers, constants, and type inference *}

727 \label{Typing}

729 There is one final step in the translation from strings to terms that we have

730 not covered yet. It explains how constants are distinguished from {\tt Free}s

731 and how {\tt Free}s and {\tt Var}s are typed. Both issues arise because {\tt

732 Free}s and {\tt Var}s are not declared.

734 An identifier $f$ that does not appear as a delimiter in the concrete syntax

735 can be either a free variable or a constant. Since the parser knows only

736 about those constants which appear in mixfix annotations, it parses $f$ as

737 {\tt Free("$f$",dummyT)}, where \ttindex{dummyT} is the predefined dummy {\tt

738 typ}. Although the parser produces these very raw terms, most user

739 interface level functions like {\tt goal} type terms according to the given

740 theory, say $T$. In a first step, every occurrence of {\tt Free($f$,_)} or

741 {\tt Const($f$,_)} is replaced by {\tt Const($f$,$\tau$)}, provided there is

742 a constant $f$ of {\tt typ} $\tau$ in $T$. This means that identifiers are

743 treated as {\tt Free}s iff they are not declared in the theory. The types of

744 the remaining {\tt Free}s (and {\tt Var}s) are inferred as in \ML. Type

745 constraints can be used to remove ambiguities.

747 One peculiarity of the current type inference algorithm is that variables

748 with the same name must have the same type, irrespective of whether they are

749 schematic, free or bound. For example, take the first-order formula $f(x) = x

750 \land (\forall f.~ f=f)$ where ${=} :: [\alpha{::}term,\alpha]\To o$ and

751 $\forall :: (\alpha{::}term\To o)\To o$. The first conjunct forces

752 $x::\alpha{::}term$ and $f::\alpha\To\alpha$, the second one

753 $f::\beta{::}term$. Although the two $f$'s are distinct, they are required to

754 have the same type. Unifying $\alpha\To\alpha$ and $\beta{::}term$ fails

755 because, in first-order logic, function types are not in class $term$.

758 \section{Putting it all together}

760 Having discussed the individual building blocks of a logic definition, it

761 remains to be shown how they fit together. In particular we need to say how

762 an object-logic syntax is hooked up to the meta-logic. Since all theorems

763 must conform to the syntax for $prop$ (see Figure~\ref{MetaLogicSyntax}),

764 that syntax has to be extended with the object-level syntax. Assume that the

765 syntax of your object-logic defines a category $o$ of formulae. These

766 formulae can now appear in axioms and theorems wherever $prop$ does if you

767 add the production

768 \[ prop ~=~ form. \]

769 More precisely, you need a coercion from formulae to propositions:

770 \begin{ttbox}

771 Base = Pure +

772 types o 0

773 arities o :: logic

774 consts Trueprop :: "o => prop" ("_" 5)

775 end

776 \end{ttbox}

777 The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible

778 coercion function. Assuming this definition resides in a file {\tt base.thy},

779 you have to load it with the command {\tt use_thy"base"}.

781 One of the simplest nontrivial logics is {\em minimal logic} of

782 implication. Its definition in Isabelle needs no advanced features but

783 illustrates the overall mechanism quite nicely:

784 \begin{ttbox}

785 Hilbert = Base +

786 consts "-->" :: "[o,o] => o" (infixr 10)

787 rules

788 K "P --> Q --> P"

789 S "(P --> Q --> R) --> (P --> Q) --> P --> R"

790 MP "[| P --> Q; P |] ==> Q"

791 end

792 \end{ttbox}

793 After loading this definition you can start to prove theorems in this logic:

794 \begin{ttbox}

795 goal Hilbert.thy "P --> P";

796 {\out Level 0}

797 {\out P --> P}

798 {\out 1. P --> P}

799 by (resolve_tac [Hilbert.MP] 1);

800 {\out Level 1}

801 {\out P --> P}

802 {\out 1. ?P --> P --> P}

803 {\out 2. ?P}

804 by (resolve_tac [Hilbert.MP] 1);

805 {\out Level 2}

806 {\out P --> P}

807 {\out 1. ?P1 --> ?P --> P --> P}

808 {\out 2. ?P1}

809 {\out 3. ?P}

810 by (resolve_tac [Hilbert.S] 1);

811 {\out Level 3}

812 {\out P --> P}

813 {\out 1. P --> ?Q2 --> P}

814 {\out 2. P --> ?Q2}

815 by (resolve_tac [Hilbert.K] 1);

816 {\out Level 4}

817 {\out P --> P}

818 {\out 1. P --> ?Q2}

819 by (resolve_tac [Hilbert.K] 1);

820 {\out Level 5}

821 {\out P --> P}

822 {\out No subgoals!}

823 \end{ttbox}

824 As you can see, this Hilbert-style formulation of minimal logic is easy to

825 define but difficult to use. The following natural deduction formulation is

826 far preferable:

827 \begin{ttbox}

828 MinI = Base +

829 consts "-->" :: "[o,o] => o" (infixr 10)

830 rules

831 impI "(P ==> Q) ==> P --> Q"

832 impE "[| P --> Q; P |] ==> Q"

833 end

834 \end{ttbox}

835 Note, however, that although the two systems are equivalent, this fact cannot

836 be proved within Isabelle: {\tt S} and {\tt K} can be derived in \verb$MinI$

837 (exercise!), but {\tt impI} cannot be derived in \verb!Hilbert!. The reason

838 is that {\tt impI} is only an {\em admissible} rule in \verb!Hilbert!,

839 something that can only be shown by induction over all possible proofs in

840 \verb!Hilbert!.

842 It is a very simple matter to extend minimal logic with falsity:

843 \begin{ttbox}

844 MinIF = MinI +

845 consts False :: "o"

846 rules

847 FalseE "False ==> P"

848 end

849 \end{ttbox}

850 On the other hand, we may wish to introduce conjunction only:

851 \begin{ttbox}

852 MinC = Base +

853 consts "&" :: "[o,o] => o" (infixr 30)

854 rules

855 conjI "[| P; Q |] ==> P & Q"

856 conjE1 "P & Q ==> P"

857 conjE2 "P & Q ==> Q"

858 end

859 \end{ttbox}

860 And if we want to have all three connectives together, we define:

861 \begin{ttbox}

862 MinIFC = MinIF + MinC

863 \end{ttbox}

864 Now we can prove mixed theorems like

865 \begin{ttbox}

866 goal MinIFC.thy "P & False --> Q";

867 by (resolve_tac [MinI.impI] 1);

868 by (dresolve_tac [MinC.conjE2] 1);

869 by (eresolve_tac [MinIF.FalseE] 1);

870 \end{ttbox}

871 Try this as an exercise!