104

1 
\chapter{Defining Logics} \label{DefiningLogics}


2 
This chapter is intended for Isabelle experts. It explains how to define new


3 
logical systems, Isabelle's {\it raison d'\^etre}. Isabelle logics are


4 
hierarchies of theories. A number of simple examples are contained in the


5 
introductory manual; the full syntax of theory definitions is shown in the


6 
{\em Reference Manual}. The purpose of this chapter is to explain the


7 
remaining subtleties, especially some context conditions on the class


8 
structure and the definition of new mixfix syntax. A full understanding of


9 
the material requires knowledge of the internal representation of terms (data


10 
type {\tt term}) as detailed in the {\em Reference Manual}. Sections marked


11 
with a * can be skipped on first reading.


12 


13 


14 
\section{Classes and Types *}


15 
\index{*arities!context conditions}


16 


17 
Type declarations are subject to the following two wellformedness


18 
conditions:


19 
\begin{itemize}


20 
\item There are no two declarations $ty :: (\vec{r})c$ and $ty :: (\vec{s})c$


21 
with $\vec{r} \neq \vec{s}$. For example


22 
\begin{ttbox}


23 
types ty 1


24 
arities ty :: (\{logic\}) logic


25 
ty :: (\{\})logic


26 
\end{ttbox}


27 
leads to an error message and fails.


28 
\item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty ::


29 
(s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold


30 
for $i=1,\dots,n$. The relationship $\preceq$, defined as


31 
\[ s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c, \]


32 
expresses that the set of types represented by $s'$ is a subset of the set of


33 
types represented by $s$. For example


34 
\begin{ttbox}


35 
classes term < logic


36 
types ty 1


37 
arities ty :: (\{logic\})logic


38 
ty :: (\{\})term


39 
\end{ttbox}


40 
leads to an error message and fails.


41 
\end{itemize}


42 
These conditions guarantee principal types~\cite{nipkowprehofer}.


43 


44 
\section{Precedence Grammars}


45 
\label{PrecedenceGrammars}


46 
\index{precedence grammar(}


47 


48 
The precise syntax of a logic is best defined by a contextfree grammar.


49 
These grammars obey the following conventions: identifiers denote


50 
nonterminals, {\tt typewriter} fount denotes terminals, repetition is


51 
indicated by \dots, and alternatives are separated by $$.


52 


53 
In order to simplify the description of mathematical languages, we introduce


54 
an extended format which permits {\bf precedences}\index{precedence}. This


55 
scheme generalizes precedence declarations in \ML\ and {\sc prolog}. In this


56 
extended grammar format, nonterminals are decorated by integers, their


57 
precedence. In the sequel, precedences are shown as subscripts. A nonterminal


58 
$A@p$ on the righthand side of a production may only be replaced using a


59 
production $A@q = \gamma$ where $p \le q$.


60 


61 
Formally, a set of context free productions $G$ induces a derivation


62 
relation $\rew@G$ on strings as follows:


63 
\[ \alpha A@p \beta ~\rew@G~ \alpha\gamma\beta ~~~iff~~~


64 
\exists q \ge p.~(A@q=\gamma) \in G


65 
\]


66 
Any extended grammar of this kind can be translated into a normal context


67 
free grammar. However, this translation may require the introduction of a


68 
large number of new nonterminals and productions.


69 


70 
\begin{example}


71 
\label{PrecedenceEx}


72 
The following simple grammar for arithmetic expressions demonstrates how


73 
binding power and associativity of operators can be enforced by precedences.


74 
\begin{center}


75 
\begin{tabular}{rclr}


76 
$A@9$ & = & {\tt0} \\


77 
$A@9$ & = & {\tt(} $A@0$ {\tt)} \\


78 
$A@0$ & = & $A@0$ {\tt+} $A@1$ \\


79 
$A@2$ & = & $A@3$ {\tt*} $A@2$ \\


80 
$A@3$ & = & {\tt} $A@3$


81 
\end{tabular}


82 
\end{center}


83 
The choice of precedences determines that \verb$$ binds tighter than


84 
\verb$*$ which binds tighter than \verb$+$, and that \verb$+$ and \verb$*$


85 
associate to the left and right, respectively.


86 
\end{example}


87 


88 
To minimize the number of subscripts, we adopt the following conventions:


89 
\begin{itemize}


90 
\item all precedences $p$ must be in the range $0 \leq p \leq max_pri$ for


91 
some fixed $max_pri$.


92 
\item precedence $0$ on the righthand side and precedence $max_pri$ on the


93 
lefthand side may be omitted.


94 
\end{itemize}


95 
In addition, we write the production $A@p = \alpha$ as $A = \alpha~(p)$.


96 


97 
Using these conventions and assuming $max_pri=9$, the grammar in


98 
Example~\ref{PrecedenceEx} becomes


99 
\begin{center}


100 
\begin{tabular}{rclc}


101 
$A$ & = & {\tt0} & \hspace*{4em} \\


102 
& $$ & {\tt(} $A$ {\tt)} \\


103 
& $$ & $A$ {\tt+} $A@1$ & (0) \\


104 
& $$ & $A@3$ {\tt*} $A@2$ & (2) \\


105 
& $$ & {\tt} $A@3$ & (3)


106 
\end{tabular}


107 
\end{center}


108 


109 
\index{precedence grammar)}


110 


111 
\section{Basic syntax *}


112 


113 
An informal account of most of Isabelle's syntax (metalogic, types etc) is


114 
contained in {\em Introduction to Isabelle}. A precise description using a


115 
precedence grammar is shown in Figure~\ref{MetaLogicSyntax}. This description


116 
is the basis of all extensions by objectlogics.


117 
\begin{figure}[htb]


118 
\begin{center}


119 
\begin{tabular}{rclc}


120 
$prop$ &=& \ttindex{PROP} $aprop$ ~~$$~~ {\tt(} $prop$ {\tt)} \\


121 
&$$& $logic@3$ \ttindex{==} $logic@2$ & (2) \\


122 
&$$& $prop@2$ \ttindex{==>} $prop@1$ & (1) \\


123 
&$$& {\tt[} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt]} {\tt==>} $prop@1$ & (1) \\


124 
&$$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\


125 
$logic$ &=& $prop$ ~~$$~~ $fun$ \\\\


126 
$aprop$ &=& $id$ ~~$$~~ $var$


127 
~~$$~~ $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\


128 
$fun$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $fun$ {\tt)} \\


129 
&$$& \ttindex{\%} $idts$ {\tt.} $logic$ & (0) \\\\


130 
$idts$ &=& $idt$ ~~$$~~ $idt@1$ $idts$ \\\\


131 
$idt$ &=& $id$ ~~$$~~ {\tt(} $idt$ {\tt)} \\


132 
&$$& $id$ \ttindex{::} $type$ & (0) \\\\


133 
$type$ &=& $tfree$ ~~$$~~ $tvar$ \\


134 
&$$& $tfree$ {\tt::} $sort$ ~~$$~~ $tvar$ {\tt::} $sort$ \\


135 
&$$& $id$ ~~$$~~ $type@{max_pri}$ $id$


136 
~~$$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\


137 
&$$& $type@1$ \ttindex{=>} $type$ & (0) \\


138 
&$$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\


139 
&$$& {\tt(} $type$ {\tt)} \\\\


140 
$sort$ &=& $id$ ~~$$~~ {\tt\{\}}


141 
~~$$~~ {\tt\{} $id$ {\tt,} \dots {\tt,} $id$ {\tt\}}


142 
\end{tabular}\index{*"!"!}\index{*"["}\index{*""]}


143 
\indexbold{type@$type$}\indexbold{sort@$sort$}\indexbold{idts@$idts$}


144 
\indexbold{logic@$logic$}\indexbold{prop@$prop$}\indexbold{fun@$fun$}


145 
\end{center}


146 
\caption{MetaLogic Syntax}


147 
\label{MetaLogicSyntax}


148 
\end{figure}


149 
The following main categories are defined:


150 
\begin{description}


151 
\item[$prop$] Terms of type $prop$, i.e.\ formulae of the metalogic.


152 
\item[$aprop$] Atomic propositions.


153 
\item[$logic$] Terms of types in class $logic$. Initially, $logic$ contains


154 
merely $prop$. As the syntax is extended by new objectlogics, more


155 
productions for $logic$ are added (see below).


156 
\item[$fun$] Terms potentially of function type.


157 
\item[$type$] Types.


158 
\item[$idts$] a list of identifiers, possibly constrained by types. Note


159 
that $x::nat~y$ is parsed as $x::(nat~y)$, i.e.\ $y$ is treated like a


160 
type constructor applied to $nat$.


161 
\end{description}


162 


163 
The predefined types $id$, $var$, $tfree$ and $tvar$ represent identifiers


164 
({\tt f}), unknowns ({\tt ?f}), type variables ({\tt 'a}), and type unknowns


165 
({\tt ?'a}) respectively. If we think of them as nonterminals with


166 
predefined syntax, we may assume that all their productions have precedence


167 
$max_pri$.


168 


169 
\subsection{Logical types and default syntax}


170 


171 
Isabelle is concerned with mathematical languages which have a certain


172 
minimal vocabulary: identifiers, variables, parentheses, and the lambda


173 
calculus. Logical types, i.e.\ those of class $logic$, are automatically


174 
equipped with this basic syntax. More precisely, for any type constructor


175 
$ty$ with arity $(\dots)c$, where $c$ is a subclass of $logic$, the following


176 
productions are added:


177 
\begin{center}


178 
\begin{tabular}{rclc}


179 
$ty$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $ty$ {\tt)} \\


180 
&$$& $fun@{max_pri}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\


181 
&$$& $ty@{max_pri}$ {\tt::} $type$\\\\


182 
$logic$ &=& $ty$


183 
\end{tabular}


184 
\end{center}


185 


186 


187 
\section{Mixfix syntax}


188 
\index{mixfix(}


189 


190 
We distinguish between abstract and concrete syntax. The {\em abstract}


191 
syntax is given by the typed constants of a theory. Abstract syntax trees are


192 
welltyped terms, i.e.\ values of \ML\ type {\tt term}. If none of the


193 
constants are introduced with mixfix annotations, there is no concrete syntax


194 
to speak of: terms can only be abstractions or applications of the form


195 
$f(t@1,\dots,t@n)$, where $f$ is a constant or variable. Since this notation


196 
quickly becomes unreadable, Isabelle supports syntax definitions in the form


197 
of unrestricted contextfree grammars using mixfix annotations.


198 


199 
Mixfix annotations describe the {\em concrete} syntax, its translation into


200 
the abstract syntax, and a prettyprinting scheme, all in one. Isabelle


201 
syntax definitions are inspired by \OBJ's~\cite{OBJ} {\em mixfix\/} syntax.


202 
Each mixfix annotation defines a precedence grammar production and associates


203 
an Isabelle constant with it.


204 


205 
A {\em mixfix declaration} {\tt consts $c$ ::\ $\tau$ ($sy$ $ps$ $p$)} is


206 
interpreted as a grammar pro\duction as follows:


207 
\begin{itemize}


208 
\item $sy$ is the righthand side of this production, specified as a {\em


209 
mixfix annotation}. In general, $sy$ is of the form


210 
$\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n$, where each occurrence of


211 
``\ttindex{_}'' denotes an argument/nonterminal and the strings


212 
$\alpha@i$ do not contain ``{\tt_}''.


213 
\item $\tau$ specifies the types of the nonterminals on the left and right


214 
hand side. If $sy$ is of the form above, $\tau$ must be of the form


215 
$[\tau@1,\dots,\tau@n] \To \tau'$. Then argument $i$ is of type $\tau@i$


216 
and the result, i.e.\ the lefthand side of the production, is of type


217 
$\tau'$. Both the $\tau@i$ and $\tau'$ may be function types.


218 
\item $c$ is the name of the Isabelle constant associated with this production.


219 
Parsing an instance of the phrase $sy$ generates the {\tt term} {\tt


220 
Const($c$,dummyT\footnote{Proper types are inserted later on. See


221 
\S\ref{Typing}})\$$a@1$\$$\dots$\$$a@n$}\index{*dummyT}, where $a@i$ is


222 
the term generated by parsing the $i^{th}$ argument.


223 
\item $ps$ must be of the form $[p@1,\dots,p@n]$, where $p@i$ is the


224 
minimal precedence\index{precedence} required of any phrase that may appear


225 
as the $i^{th}$ argument. The null list is interpreted as a list of 0's of


226 
the appropriate length.


227 
\item $p$ is the precedence of this production.


228 
\end{itemize}


229 
Notice that there is a close connection between abstract and concrete syntax:


230 
each production has an associated constant, and types act as {\bf syntactic


231 
categories} in the concrete syntax. To emphasize this connection, we


232 
sometimes refer to the nonterminals on the righthand side of a production as


233 
its arguments and to the nonterminal on the lefthand side as its result.


234 


235 
The maximal legal precedence is called \ttindexbold{max_pri}, which is


236 
currently 1000. If you want to ignore precedences, the safest way to do so is


237 
to use the annotation {\tt($sy$)}: this production puts no precedence


238 
constraints on any of its arguments and has maximal precedence itself, i.e.\


239 
it is always applicable and does not exclude any productions of its


240 
arguments.


241 


242 
\begin{example}


243 
In mixfix notation the grammar in Example~\ref{PrecedenceEx} can be written


244 
as follows:


245 
\begin{ttbox}


246 
types exp 0


247 
consts "0" :: "exp" ("0" 9)


248 
"+" :: "[exp,exp] => exp" ("_ + _" [0,1] 0)


249 
"*" :: "[exp,exp] => exp" ("_ * _" [3,2] 2)


250 
"" :: "exp => exp" (" _" [3] 3)


251 
\end{ttbox}


252 
Parsing the string \verb!"0 +  0 + 0"! produces the term {\tt


253 
$p$\$($p$\$($m$\$$z$)\$$z$)\$$z$} where {\tt$p =$ Const("+",dummyT)},


254 
{\tt$m =$ Const("",dummyT)}, and {\tt$z =$ Const("0",dummyT)}.


255 
\end{example}


256 


257 
The interpretation of \ttindex{_} in a mixfix annotation is always as a {\bf


258 
metacharacter}\index{metacharacter} which does not represent itself but


259 
an argument position. The following characters are also metacharacters:


260 
\begin{ttbox}


261 
' ( ) /


262 
\end{ttbox}


263 
Preceding any character with a quote (\verb$'$) turns it into an ordinary


264 
character. Thus you can write \verb!''! if you really want a single quote.


265 
The purpose of the other metacharacters is explained in


266 
\S\ref{PrettyPrinting}. Remember that in \ML\ strings \verb$\$ is already a


267 
(different kind of) metacharacter.


268 


269 


270 
\subsection{Types and syntactic categories *}


271 


272 
The precise mapping from types to syntactic categories is defined by the


273 
following function:


274 
\begin{eqnarray*}


275 
N(\tau@1\To\tau@2) &=& fun \\


276 
N((\tau@1,\dots,\tau@n)ty) &=& ty \\


277 
N(\alpha) &=& logic


278 
\end{eqnarray*}


279 
Only the outermost type constructor is taken into account and type variables


280 
can range over all logical types. This catches some illtyped terms (like


281 
$Cons(x,0)$, where $Cons :: [\alpha,\alpha list] \To \alpha list$ and $0 ::


282 
nat$) but leaves the real work to the type checker.


283 


284 
In terms of the precedence grammar format introduced in


285 
\S\ref{PrecedenceGrammars}, the declaration


286 
\begin{ttbox}


287 
consts \(c\) :: "[\(\tau@1\),\dots,\(\tau@n\)]\(\To\tau\)" ("\(\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n\)") [\(p@1\),\dots,\(p@n\)] \(p\))


288 
\end{ttbox}


289 
defines the production


290 
\[ N(\tau)@p ~~=~~ \alpha@0 ~N(\tau@1)@{p@1}~ \alpha@1~ \dots


291 
~\alpha@{n1} ~N(\tau@n)@{p@n}~ \alpha@n


292 
\]


293 


294 
\subsection{Copy productions *}


295 


296 
Productions which do not create a new node in the abstract syntax tree are


297 
called {\bf copy productions}. They must have exactly one nonterminal on


298 
the right hand side. The term generated when parsing that nonterminal is


299 
simply passed up as the result of parsing the whole copy production. In


300 
Isabelle a copy production is indicated by an empty constant name, i.e.\ by


301 
\begin{ttbox}


302 
consts "" :: \(\tau\) (\(sy\) \(ps\) \(p\))


303 
\end{ttbox}


304 


305 
A special kind of copy production is one where, modulo white space, $sy$ is


306 
{\tt"_"}. It is called a {\bf chain production}. Chain productions should be


307 
seen as an abbreviation mechanism. Conceptually, they are removed from the


308 
grammar by adding appropriate new rules. Precedence information attached to


309 
chain productions is ignored. The following example demonstrates the effect:


310 
the grammar defined by


311 
\begin{ttbox}


312 
types A,B,C 0


313 
consts AB :: "B => A" ("A _" [10] 517)


314 
"" :: "C => B" ("_" [0] 100)


315 
x :: "C" ("x" 5)


316 
y :: "C" ("y" 15)


317 
\end{ttbox}


318 
admits {\tt"A y"} but not {\tt"A x"}. Had the constant in the second


319 
production been some nonempty string, both {\tt"A y"} and {\tt"A x"} would


320 
be legal.


321 


322 
\index{mixfix)}


323 


324 
\section{Lexical conventions}


325 


326 
The lexical analyzer distinguishes the following kinds of tokens: delimiters,


327 
identifiers, unknowns, type variables and type unknowns.


328 


329 
Delimiters are userdefined, i.e.\ they are extracted from the syntax


330 
definition. If $\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n$ is a mixfix


331 
annotation, each $\alpha@i$ is decomposed into substrings


332 
$\beta@1~\dots~\beta@k$ which are separated by and do not contain


333 
\bfindex{white space} ( = blanks, tabs, newlines). Each $\beta@j$ becomes a


334 
delimiter. Thus a delimiter can be an arbitrary string not containing white


335 
space.


336 


337 
The lexical syntax of identifiers and variables ( = unknowns) is defined in


338 
the introductory manual. Parsing an identifier $f$ generates {\tt


339 
Free($f$,dummyT)}\index{*dummyT}. Parsing a variable {\tt?}$v$ generates


340 
{\tt Var(($u$,$i$),dummyT)} where $i$ is the integer value of the longest


341 
numeric suffix of $v$ (possibly $0$), and $u$ is the remaining prefix.


342 
Parsing a variable {\tt?}$v{.}i$ generates {\tt Var(($v$,$i$),dummyT)}. The


343 
following table covers the four different cases that can arise:


344 
\begin{center}\tt


345 
\begin{tabular}{cccc}


346 
"?v" & "?v.7" & "?v5" & "?v7.5" \\


347 
Var(("v",0),$d$) & Var(("v",7),$d$) & Var(("v",5),$d$) & Var(("v7",5),$d$)


348 
\end{tabular}


349 
\end{center}


350 
where $d = {\tt dummyT}$.


351 


352 
In mixfix annotations, \ttindexbold{id}, \ttindexbold{var},


353 
\ttindexbold{tfree} and \ttindexbold{tvar} are the predefined categories of


354 
identifiers, unknowns, type variables and type unknowns, respectively.


355 


356 


357 
The lexical analyzer translates input strings to token lists by repeatedly


358 
taking the maximal prefix of the input string that forms a valid token. A


359 
maximal prefix that is both a delimiter and an identifier or variable (like


360 
{\tt ALL}) is treated as a delimiter. White spaces are separators.


361 


362 
An important consequence of this translation scheme is that delimiters need


363 
not be separated by white space to be recognized as separate. If \verb$""$


364 
is a delimiter but \verb$""$ is not, the string \verb$""$ is treated as


365 
two consecutive occurrences of \verb$""$. This is in contrast to \ML\ which


366 
would treat \verb$""$ as a single (undeclared) identifier. The


367 
consequence of Isabelle's more liberal scheme is that the same string may be


368 
parsed in a different way after extending the syntax: after adding


369 
\verb$""$ as a delimiter, the input \verb$""$ is treated as


370 
a single occurrence of \verb$""$.


371 


372 
\section{Infix operators}


373 


374 
{\tt Infixl} and {\tt infixr} declare infix operators which associate to the


375 
left and right respectively. As in \ML, prefixing infix operators with


376 
\ttindexbold{op} turns them into curried functions. Infix declarations can


377 
be reduced to mixfix ones as follows:


378 
\begin{center}\tt


379 
\begin{tabular}{l@{~~$\Longrightarrow$~~}l}


380 
"$c$" ::~$\tau$ (\ttindexbold{infixl} $p$) &


381 
"op $c$" ::~$\tau$ ("_ $c$ _" [$p$,$p+1$] $p$) \\


382 
"$c$" ::~$\tau$ (\ttindexbold{infixr} $p$) &


383 
"op $c$" ::~$\tau$ ("_ $c$ _" [$p+1$,$p$] $p$)


384 
\end{tabular}


385 
\end{center}


386 


387 


388 
\section{Binders}


389 
A {\bf binder} is a variablebinding constant, such as a quantifier.


390 
The declaration


391 
\begin{ttbox}


392 
consts \(c\) :: \(\tau\) (binder \(Q\) \(p\))


393 
\end{ttbox}\indexbold{*binder}


394 
introduces a binder $c$ of type $\tau$,


395 
which must have the form $(\tau@1\To\tau@2)\To\tau@3$. Its concrete syntax


396 
is $Q~x.t$. A binder is like a generalized quantifier where $\tau@1$ is the


397 
type of the bound variable $x$, $\tau@2$ the type of the body $t$, and


398 
$\tau@3$ the type of the whole term. For example $\forall$ can be declared


399 
like this:


400 
\begin{ttbox}


401 
consts All :: "('a => o) => o" (binder "ALL " 10)


402 
\end{ttbox}


403 
This allows us to write $\forall x.P$ either as {\tt ALL $x$.$P$} or {\tt


404 
All(\%$x$.$P$)}; the latter form is for purists only.


405 


406 
In case $\tau@2 = \tau@3$, nested quantifications can be written as $Q~x@1


407 
\dots x@n.t$. From a syntactic point of view,


408 
\begin{ttbox}


409 
consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)" (binder "\(Q\)" \(p\))


410 
\end{ttbox}


411 
is equivalent to


412 
\begin{ttbox}


413 
consts \(c\) :: "\((\tau@1\To\tau@2)\To\tau@3\)"


414 
"\(Q\)" :: "[idts,\(\tau@2\)] => \(\tau@3\)" ("\(Q\)_. _" \(p\))


415 
\end{ttbox}


416 
where {\tt idts} is the syntactic category $idts$ defined in


417 
Figure~\ref{MetaLogicSyntax}.


418 


419 
However, there is more to binders than concrete syntax: behind the scenes the


420 
body of the quantified expression has to be converted into a


421 
$\lambda$abstraction (when parsing) and back again (when printing). This


422 
is performed by the translation mechanism, which is discussed below. For


423 
binders, the definition of the required translation functions has been


424 
automated. Many other syntactic forms, such as set comprehension, require


425 
special treatment.


426 


427 


428 
\section{Parse translations *}


429 
\label{Parsetranslations}


430 
\index{parse translation(}


431 


432 
So far we have pretended that there is a close enough relationship between


433 
concrete and abstract syntax to allow an automatic translation from one to


434 
the other using the constant name supplied with each production. In many


435 
cases this scheme is not powerful enough, especially for constructs involving


436 
variable bindings. Therefore the $ML$section of a theory definition can


437 
associate constant names with userdefined translation functions by including


438 
a line


439 
\begin{ttbox}


440 
val parse_translation = \dots


441 
\end{ttbox}


442 
where the righthand side of this binding must be an \MLexpression of type


443 
\verb$(string * (term list > term))list$.


444 


445 
After the input string has been translated into a term according to the


446 
syntax definition, there is a second phase in which the term is translated


447 
using the usersupplied functions in a bottomup manner. Given a list $tab$


448 
of the above type, a term $t$ is translated as follows. If $t$ is not of the


449 
form {\tt Const($c$,$\tau$)\$$t@1$\$\dots\$$t@n$}, then $t$ is returned


450 
unchanged. Otherwise all $t@i$ are translated into $t@i'$. Let {\tt $t' =$


451 
Const($c$,$\tau$)\$$t@1'$\$\dots\$$t@n'$}. If there is no pair $(c,f)$ in


452 
$tab$, return $t'$. Otherwise apply $f$ to $[t@1',\dots,t@n']$. If that


453 
raises an exception, return $t'$, otherwise return the result.


454 
\begin{example}\label{listenum}


455 
\MLlists are constructed by {\tt[]} and {\tt::}. For readability the


456 
list \hbox{\tt$x$::$y$::$z$::[]} can be written \hbox{\tt[$x$,$y$,$z$]}.


457 
In Isabelle the two forms of lists are declared as follows:


458 
\begin{ttbox}


459 
types list 1


460 
enum 0


461 
arities list :: (term)term


462 
consts "[]" :: "'a list" ("[]")


463 
":" :: "['a, 'a list] => 'a list" (infixr 50)


464 
enum :: "enum => 'a list" ("[_]")


465 
sing :: "'a => enum" ("_")


466 
cons :: "['a,enum] => enum" ("_, _")


467 
end


468 
\end{ttbox}


469 
Because \verb$::$ is already used for type constraints, it is replaced by


470 
\verb$:$ as the infix list constructor.


471 


472 
In order to allow list enumeration, the new type {\tt enum} is introduced.


473 
Its only purpose is syntactic and hence it does not need an arity, in


474 
contrast to the logical type {\tt list}. Although \hbox{\tt[$x$,$y$,$z$]} is


475 
syntactically legal, it needs to be translated into a term built up from


476 
\verb$[]$ and \verb$:$. This is what \verb$make_list$ accomplishes:


477 
\begin{ttbox}


478 
val cons = Const("op :", dummyT);


479 


480 
fun make_list (Const("sing",_)$e) = cons $ e $ Const("[]", dummyT)


481 
 make_list (Const("cons",_)$e$es) = cons $ e $ make_list es;


482 
\end{ttbox}


483 
To hook this translation up to Isabelle's parser, the theory definition needs


484 
to contain the following $ML$section:


485 
\begin{ttbox}


486 
ML


487 
fun enum_tr[enum] = make_list enum;


488 
val parse_translation = [("enum",enum_tr)]


489 
\end{ttbox}


490 
This causes \verb!Const("enum",_)$!$t$ to be replaced by


491 
\verb$enum_tr[$$t$\verb$]$.


492 


493 
Of course the definition of \verb$make_list$ should be included in the


494 
$ML$section.


495 
\end{example}


496 
\begin{example}\label{SET}


497 
Isabelle represents the set $\{ x \mid P(x) \}$ internally by $Set(\lambda


498 
x.P(x))$. The internal and external forms need separate


499 
constants:\footnote{In practice, the external form typically has a name


500 
beginning with an {\at} sign, such as {\tt {\at}SET}. This emphasizes that


501 
the constant should be used only for parsing/printing.}


502 
\begin{ttbox}


503 
types set 1


504 
arities set :: (term)term


505 
consts Set :: "('a => o) => 'a set"


506 
SET :: "[id,o] => 'a set" ("\{_  _\}")


507 
\end{ttbox}


508 
Parsing {\tt"\{$x$  $P$\}"} according to this syntax yields the term {\tt


509 
Const("SET",dummyT) \$ Free("\(x\)",dummyT) \$ \(p\)}, where $p$ is the


510 
result of parsing $P$. What we need is the term {\tt


511 
Const("Set",dummyT)\$Abs("$x$",dummyT,$p'$)}, where $p'$ is some


512 
``abstracted'' version of $p$. Therefore we define a function


513 
\begin{ttbox}


514 
fun set_tr[Free(s,T), p] = Const("Set", dummyT) $


515 
Abs(s, T, abstract_over(Free(s,T), p));


516 
\end{ttbox}


517 
where \verb$abstract_over: term*term > term$ is a predefined function such


518 
that {\tt abstract_over($u$,$t$)} replaces every occurrence of $u$ in $t$ by


519 
a {\tt Bound} variable of the correct index (i.e.\ 0 at top level). Remember


520 
that {\tt dummyT} is replaced by the correct types at a later stage (see


521 
\S\ref{Typing}). Function {\tt set_tr} is associated with {\tt SET} by


522 
including the \MLtext


523 
\begin{ttbox}


524 
val parse_translation = [("SET", set_tr)];


525 
\end{ttbox}


526 
\end{example}


527 


528 
If you want to run the above examples in Isabelle, you should note that an


529 
$ML$section needs to contain not just a definition of


530 
\verb$parse_translation$ but also of a variable \verb$print_translation$. The


531 
purpose of the latter is to reverse the effect of the former during printing;


532 
details are found in \S\ref{Printtranslations}. Hence you need to include


533 
the line


534 
\begin{ttbox}


535 
val print_translation = [];


536 
\end{ttbox}


537 
This is instructive because the terms are then printed out in their internal


538 
form. For example the input \hbox{\tt[$x$,$y$,$z$]} is echoed as


539 
\hbox{\tt$x$:$y$:$z$:[]}. This helps to check that your parse translation is


540 
working correctly.


541 


542 
%\begin{warn}


543 
%Explicit type constraints disappear with type checking but are still


544 
%visible to the parse translation functions.


545 
%\end{warn}


546 


547 
\index{parse translation)}


548 


549 
\section{Printing}


550 


551 
Syntax definitions provide printing information in three distinct ways:


552 
through


553 
\begin{itemize}


554 
\item the syntax of the language (as used for parsing),


555 
\item pretty printing information, and


556 
\item print translation functions.


557 
\end{itemize}


558 
The bare mixfix declarations enable Isabelle to print terms, but the result


559 
will not necessarily be pretty and may look different from what you expected.


560 
To produce a pleasing layout, you need to read the following sections.


561 


562 
\subsection{Printing with mixfix declarations}


563 


564 
Let {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} be a term and let


565 
\begin{ttbox}


566 
consts \(c\) :: \(\tau\) (\(sy\))


567 
\end{ttbox}


568 
be a mixfix declaration where $sy$ is of the form


569 
$\alpha@0\_\alpha@1\dots\alpha@{n1}\_\alpha@n$. Printing $t$ according to


570 
$sy$ means printing the string


571 
$\alpha@0\beta@1\alpha@1\ldots\alpha@{n1}\beta@n\alpha@n$, where $\beta@i$


572 
is the result of printing $t@i$.


573 


574 
Note that the system does {\em not\/} insert blanks. They should be part of


575 
the mixfix syntax if they are required to separate tokens or achieve a


576 
certain layout.


577 


578 
\subsection{Pretty printing}


579 
\label{PrettyPrinting}


580 
\index{pretty printing}


581 


582 
In order to format the output, it is possible to embed pretty printing


583 
directives in mixfix annotations. These directives are ignored during parsing


584 
and affect only printing. The characters {\tt(}, {\tt)} and {\tt/} are


585 
interpreted as metacharacters\index{metacharacter} when found in a mixfix


586 
annotation. Their meaning is


587 
\begin{description}


588 
\item[~{\tt(}~] Open a block. A sequence of digits following it is


589 
interpreted as the \bfindex{indentation} of this block. It causes the


590 
output to be indented by $n$ positions if a line break occurs within the


591 
block. If {\tt(} is not followed by a digit, the indentation defaults to


592 
$0$.


593 
\item[~{\tt)}~] Close a block.


594 
\item[~\ttindex{/}~] Allow a \bfindex{line break}. White space immediately


595 
following {\tt/} is not printed if the line is broken at this point.


596 
\end{description}


597 


598 
\subsection{Print translations *}


599 
\label{Printtranslations}


600 
\index{print translation(}


601 


602 
Since terms are translated after parsing (see \S\ref{Parsetranslations}),


603 
there is a similar mechanism to translate them back before printing.


604 
Therefore the $ML$section of a theory definition can associate constant


605 
names with userdefined translation functions by including a line


606 
\begin{ttbox}


607 
val print_translation = \dots


608 
\end{ttbox}


609 
where the righthand side of this binding is again an \MLexpression of type


610 
\verb$(string * (term list > term))list$.


611 
Including a pair $(c,f)$ in this list causes the printer to print


612 
$f[t@1,\dots,t@n]$ whenever it finds {\tt Const($c$,_)\$$t@1$\$\dots\$$t@n$}.


613 
\begin{example}


614 
Reversing the effect of the parse translation in Example~\ref{listenum} is


615 
accomplished by the following function:


616 
\begin{ttbox}


617 
fun make_enum (Const("op :",_) $ e $ es) = case es of


618 
Const("[]",_) => Const("sing",dummyT) $ e


619 
 _ => Const("enum",dummyT) $ e $ make_enum es;


620 
\end{ttbox}


621 
It translates \hbox{\tt$x$:$y$:$z$:[]} to \hbox{\tt[$x$,$y$,$z$]}. However,


622 
if the input does not terminate with an empty list, e.g.\ \hbox{\tt$x$:$xs$},


623 
\verb$make_enum$ raises exception {\tt Match}. This signals that the


624 
attempted translation has failed and the term should be printed as is.


625 
The connection with Isabelle's pretty printer is established as follows:


626 
\begin{ttbox}


627 
fun enum_tr'[x,xs] = Const("enum",dummyT) $


628 
make_enum(Const("op :",dummyT)$x$xs);


629 
val print_translation = [("op :", enum_tr')];


630 
\end{ttbox}


631 
This declaration causes the printer to print \hbox{\tt enum_tr'[$x$,$y$]}


632 
whenever it finds \verb!Const("op :",_)$!$x$\verb!$!$y$.


633 
\end{example}


634 
\begin{example}


635 
In Example~\ref{SET} we showed how to translate the concrete syntax for set


636 
comprehension into the proper internal form. The string {\tt"\{$x$ 


637 
$P$\}"} now becomes {\tt Const("Set",_)\$Abs("$x$",_,$p$)}. If, however,


638 
the latter term were printed without translating it back, it would result


639 
in {\tt"Set(\%$x$.$P$)"}. Therefore the abstraction has to be turned back


640 
into a term that matches the concrete mixfix syntax:


641 
\begin{ttbox}


642 
fun set_tr'[Abs(x,T,P)] =


643 
let val (x',P') = variant_abs(x,T,P)


644 
in Const("SET",dummyT) $ Free(x',T) $ P' end;


645 
\end{ttbox}


646 
The function \verb$variant_abs$, a basic term manipulation function, replaces


647 
the bound variable $x$ by a {\tt Free} variable $x'$ having a unique name. A


648 
term produced by {\tt set_tr'} can now be printed according to the concrete


649 
syntax defined in Example~\ref{SET} above.


650 


651 
Notice that the application of {\tt set_tr'} fails if the second component of


652 
the argument is not an abstraction, but for example just a {\tt Free}


653 
variable. This is intentional because it signals to the caller that the


654 
translation is inapplicable. As a result {\tt Const("Set",_)\$Free("P",_)}


655 
prints as {\tt"Set(P)"}.


656 


657 
The full theory extension, including concrete syntax and both translation


658 
functions, has the following form:


659 
\begin{ttbox}


660 
types set 1


661 
arities set :: (term)term


662 
consts Set :: "('a => o) => 'a set"


663 
SET :: "[id,o] => 'a set" ("\{_  _\}")


664 
end


665 
ML


666 
fun set_tr[Free(s,T), p] = \dots;


667 
val parse_translation = [("SET", set_tr)];


668 
fun set_tr'[Abs(x,T,P)] = \dots;


669 
val print_translation = [("Set", set_tr')];


670 
\end{ttbox}


671 
\end{example}


672 
As during the parse translation process, the types attached to constants


673 
during print translation are ignored. Thus {\tt Const("SET",dummyT)} in


674 
{\tt set_tr'} above is acceptable. The types of {\tt Free}s and {\tt Var}s


675 
however must be preserved because they may get printed (see {\tt


676 
show_types}).


677 


678 
\index{print translation)}


679 


680 
\subsection{Printing a term}


681 


682 
Let $tab$ be the set of all stringfunction pairs of print translations in the


683 
current syntax.


684 


685 
Terms are printed recursively; print translations are applied top down:


686 
\begin{itemize}


687 
\item {\tt Free($x$,_)} is printed as $x$.


688 
\item {\tt Var(($x$,$i$),_)} is printed as $x$, if $i = 0$ and $x$ does not


689 
end with a digit, as $x$ followed by $i$, if $i \neq 0$ and $x$ does not


690 
end with a digit, and as {\tt $x$.$i$}, if $x$ ends with a digit. Thus the


691 
following cases can arise:


692 
\begin{center}


693 
{\tt\begin{tabular}{cccc}


694 
\verb$Var(("v",0),_)$ & \verb$Var(("v",7),_)$ & \verb$Var(("v5",0),_)$ \\


695 
"?v" & "?v7" & "?v5.0"


696 
\end{tabular}}


697 
\end{center}


698 
\item {\tt Abs($x@1$,_,Abs($x@2$,_,\dots Abs($x@n$,_,$p$)\dots))}, where $p$


699 
is not an abstraction, is printed as {\tt \%$y@1\dots y@n$.$P$}, where $P$


700 
is the result of printing $p$, and the $x@i$ are replaced by $y@i$. The


701 
latter are (possibly new) unique names.


702 
\item {\tt Bound($i$)} is printed as {\tt B.$i$} \footnote{The occurrence of


703 
such ``loose'' bound variables indicates that either you are trying to


704 
print a subterm of an abstraction, or there is something wrong with your


705 
print translations.}.


706 
\item The application {\tt$t =$ Const($c$,_)\$$t@1$\$\dots\$$t@n$} (where


707 
$n$ may be $0$!) is printed as follows:


708 


709 
If there is a pair $(c,f)$ in $tab$, print $f[t@1,\dots,t@n]$. If this


710 
application raises exception {\tt Match} or there is no pair $(c,f)$ in


711 
$tab$, let $sy$ be the mixfix annotation associated with $c$. If there is


712 
no such $sy$, or if $sy$ does not have exactly $n$ argument positions, $t$


713 
is printed as an application; otherwise $t$ is printed according to $sy$.


714 


715 
All other applications are printed as applications.


716 
\end{itemize}


717 
Printing a term {\tt $c$\$$t@1$\$\dots\$$t@n$} as an application means


718 
printing it as {\tt $s$($s@1$,\dots,$s@n$)}, where $s@i$ is the result of


719 
printing $t@i$. If $c$ is a {\tt Const}, $s$ is its first argument;


720 
otherwise $s$ is the result of printing $c$ as described above.


721 
\medskip


722 


723 
The printer also inserts parentheses where they are necessary for reasons


724 
of precedence.


725 


726 
\section{Identifiers, constants, and type inference *}


727 
\label{Typing}


728 


729 
There is one final step in the translation from strings to terms that we have


730 
not covered yet. It explains how constants are distinguished from {\tt Free}s


731 
and how {\tt Free}s and {\tt Var}s are typed. Both issues arise because {\tt


732 
Free}s and {\tt Var}s are not declared.


733 


734 
An identifier $f$ that does not appear as a delimiter in the concrete syntax


735 
can be either a free variable or a constant. Since the parser knows only


736 
about those constants which appear in mixfix annotations, it parses $f$ as


737 
{\tt Free("$f$",dummyT)}, where \ttindex{dummyT} is the predefined dummy {\tt


738 
typ}. Although the parser produces these very raw terms, most user


739 
interface level functions like {\tt goal} type terms according to the given


740 
theory, say $T$. In a first step, every occurrence of {\tt Free($f$,_)} or


741 
{\tt Const($f$,_)} is replaced by {\tt Const($f$,$\tau$)}, provided there is


742 
a constant $f$ of {\tt typ} $\tau$ in $T$. This means that identifiers are


743 
treated as {\tt Free}s iff they are not declared in the theory. The types of


744 
the remaining {\tt Free}s (and {\tt Var}s) are inferred as in \ML. Type


745 
constraints can be used to remove ambiguities.


746 


747 
One peculiarity of the current type inference algorithm is that variables


748 
with the same name must have the same type, irrespective of whether they are


749 
schematic, free or bound. For example, take the firstorder formula $f(x) = x


750 
\land (\forall f.~ f=f)$ where ${=} :: [\alpha{::}term,\alpha]\To o$ and


751 
$\forall :: (\alpha{::}term\To o)\To o$. The first conjunct forces


752 
$x::\alpha{::}term$ and $f::\alpha\To\alpha$, the second one


753 
$f::\beta{::}term$. Although the two $f$'s are distinct, they are required to


754 
have the same type. Unifying $\alpha\To\alpha$ and $\beta{::}term$ fails


755 
because, in firstorder logic, function types are not in class $term$.


756 


757 


758 
\section{Putting it all together}


759 


760 
Having discussed the individual building blocks of a logic definition, it


761 
remains to be shown how they fit together. In particular we need to say how


762 
an objectlogic syntax is hooked up to the metalogic. Since all theorems


763 
must conform to the syntax for $prop$ (see Figure~\ref{MetaLogicSyntax}),


764 
that syntax has to be extended with the objectlevel syntax. Assume that the


765 
syntax of your objectlogic defines a category $o$ of formulae. These


766 
formulae can now appear in axioms and theorems wherever $prop$ does if you


767 
add the production


768 
\[ prop ~=~ form. \]


769 
More precisely, you need a coercion from formulae to propositions:


770 
\begin{ttbox}


771 
Base = Pure +


772 
types o 0


773 
arities o :: logic


774 
consts Trueprop :: "o => prop" ("_" 5)


775 
end


776 
\end{ttbox}


777 
The constant {\tt Trueprop} (the name is arbitrary) acts as an invisible


778 
coercion function. Assuming this definition resides in a file {\tt base.thy},


779 
you have to load it with the command {\tt use_thy"base"}.


780 


781 
One of the simplest nontrivial logics is {\em minimal logic} of


782 
implication. Its definition in Isabelle needs no advanced features but


783 
illustrates the overall mechanism quite nicely:


784 
\begin{ttbox}


785 
Hilbert = Base +


786 
consts ">" :: "[o,o] => o" (infixr 10)


787 
rules


788 
K "P > Q > P"


789 
S "(P > Q > R) > (P > Q) > P > R"


790 
MP "[ P > Q; P ] ==> Q"


791 
end


792 
\end{ttbox}


793 
After loading this definition you can start to prove theorems in this logic:


794 
\begin{ttbox}


795 
goal Hilbert.thy "P > P";


796 
{\out Level 0}


797 
{\out P > P}


798 
{\out 1. P > P}


799 
by (resolve_tac [Hilbert.MP] 1);


800 
{\out Level 1}


801 
{\out P > P}


802 
{\out 1. ?P > P > P}


803 
{\out 2. ?P}


804 
by (resolve_tac [Hilbert.MP] 1);


805 
{\out Level 2}


806 
{\out P > P}


807 
{\out 1. ?P1 > ?P > P > P}


808 
{\out 2. ?P1}


809 
{\out 3. ?P}


810 
by (resolve_tac [Hilbert.S] 1);


811 
{\out Level 3}


812 
{\out P > P}


813 
{\out 1. P > ?Q2 > P}


814 
{\out 2. P > ?Q2}


815 
by (resolve_tac [Hilbert.K] 1);


816 
{\out Level 4}


817 
{\out P > P}


818 
{\out 1. P > ?Q2}


819 
by (resolve_tac [Hilbert.K] 1);


820 
{\out Level 5}


821 
{\out P > P}


822 
{\out No subgoals!}


823 
\end{ttbox}


824 
As you can see, this Hilbertstyle formulation of minimal logic is easy to


825 
define but difficult to use. The following natural deduction formulation is


826 
far preferable:


827 
\begin{ttbox}


828 
MinI = Base +


829 
consts ">" :: "[o,o] => o" (infixr 10)


830 
rules


831 
impI "(P ==> Q) ==> P > Q"


832 
impE "[ P > Q; P ] ==> Q"


833 
end


834 
\end{ttbox}


835 
Note, however, that although the two systems are equivalent, this fact cannot


836 
be proved within Isabelle: {\tt S} and {\tt K} can be derived in \verb$MinI$


837 
(exercise!), but {\tt impI} cannot be derived in \verb!Hilbert!. The reason


838 
is that {\tt impI} is only an {\em admissible} rule in \verb!Hilbert!,


839 
something that can only be shown by induction over all possible proofs in


840 
\verb!Hilbert!.


841 


842 
It is a very simple matter to extend minimal logic with falsity:


843 
\begin{ttbox}


844 
MinIF = MinI +


845 
consts False :: "o"


846 
rules


847 
FalseE "False ==> P"


848 
end


849 
\end{ttbox}


850 
On the other hand, we may wish to introduce conjunction only:


851 
\begin{ttbox}


852 
MinC = Base +


853 
consts "&" :: "[o,o] => o" (infixr 30)


854 
rules


855 
conjI "[ P; Q ] ==> P & Q"


856 
conjE1 "P & Q ==> P"


857 
conjE2 "P & Q ==> Q"


858 
end


859 
\end{ttbox}


860 
And if we want to have all three connectives together, we define:


861 
\begin{ttbox}


862 
MinIFC = MinIF + MinC


863 
\end{ttbox}


864 
Now we can prove mixed theorems like


865 
\begin{ttbox}


866 
goal MinIFC.thy "P & False > Q";


867 
by (resolve_tac [MinI.impI] 1);


868 
by (dresolve_tac [MinC.conjE2] 1);


869 
by (eresolve_tac [MinIF.FalseE] 1);


870 
\end{ttbox}


871 
Try this as an exercise!
