320

1 
%% $Id$


2 
\chapter{Defining Logics} \label{DefiningLogics}


3 
This chapter explains how to define new formal systems  in particular,


4 
their concrete syntax. While Isabelle can be regarded as a theorem prover


5 
for set theory, higherorder logic or the sequent calculus, its


6 
distinguishing feature is support for the definition of new logics.


7 


8 
Isabelle logics are hierarchies of theories, which are described and


9 
illustrated in


10 
\iflabelundefined{sec:definingtheories}{{\em Introduction to Isabelle}}%


11 
{\S\ref{sec:definingtheories}}. That material, together with the theory


12 
files provided in the examples directories, should suffice for all simple


13 
applications. The easiest way to define a new theory is by modifying a


14 
copy of an existing theory.


15 


16 
This chapter documents the metalogic syntax, mixfix declarations and


17 
pretty printing. The extended examples in \S\ref{sec:min_logics}


18 
demonstrate the logical aspects of the definition of theories.


19 


20 


21 
\section{Priority grammars} \label{sec:priority_grammars}


22 
\index{priority grammars(}


23 


24 
A contextfree grammar contains a set of {\bf nonterminal symbols}, a set of


25 
{\bf terminal symbols} and a set of {\bf productions}\index{productions}.


26 
Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and


27 
$\gamma$ is a string of terminals and nonterminals. One designated


28 
nonterminal is called the {\bf start symbol}. The language defined by the


29 
grammar consists of all strings of terminals that can be derived from the


30 
start symbol by applying productions as rewrite rules.


31 


32 
The syntax of an Isabelle logic is specified by a {\bf priority


33 
grammar}.\index{priorities} Each nonterminal is decorated by an integer


34 
priority, as in~$A^{(p)}$. A nonterminal $A^{(p)}$ in a derivation may be


35 
rewritten using a production $A^{(q)} = \gamma$ only if~$p \le q$. Any


36 
priority grammar can be translated into a normal context free grammar by


37 
introducing new nonterminals and productions.


38 


39 
Formally, a set of context free productions $G$ induces a derivation


40 
relation $\longrightarrow@G$. Let $\alpha$ and $\beta$ denote strings of


41 
terminal or nonterminal symbols. Then


42 
\[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \]


43 
if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$.


44 


45 
The following simple grammar for arithmetic expressions demonstrates how


46 
binding power and associativity of operators can be enforced by priorities.


47 
\begin{center}


48 
\begin{tabular}{rclr}


49 
$A^{(9)}$ & = & {\tt0} \\


50 
$A^{(9)}$ & = & {\tt(} $A^{(0)}$ {\tt)} \\


51 
$A^{(0)}$ & = & $A^{(0)}$ {\tt+} $A^{(1)}$ \\


52 
$A^{(2)}$ & = & $A^{(3)}$ {\tt*} $A^{(2)}$ \\


53 
$A^{(3)}$ & = & {\tt} $A^{(3)}$


54 
\end{tabular}


55 
\end{center}


56 
The choice of priorities determines that {\tt } binds tighter than {\tt *},


57 
which binds tighter than {\tt +}. Furthermore {\tt +} associates to the


58 
left and {\tt *} to the right.


59 


60 
For clarity, grammars obey these conventions:


61 
\begin{itemize}


62 
\item All priorities must lie between~0 and \ttindex{max_pri}, which is a


63 
some fixed integer. Sometimes {\tt max_pri} is written as $\infty$.


64 
\item Priority 0 on the righthand side and priority \ttindex{max_pri} on


65 
the lefthand side may be omitted.


66 
\item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the


67 
priority of the lefthand side actually appears in a column on the far


68 
right.


69 
\item Alternatives are separated by~$$.


70 
\item Repetition is indicated by dots~(\dots) in an informal but obvious


71 
way.


72 
\end{itemize}


73 


74 
Using these conventions and assuming $\infty=9$, the grammar


75 
takes the form


76 
\begin{center}


77 
\begin{tabular}{rclc}


78 
$A$ & = & {\tt0} & \hspace*{4em} \\


79 
& $$ & {\tt(} $A$ {\tt)} \\


80 
& $$ & $A$ {\tt+} $A^{(1)}$ & (0) \\


81 
& $$ & $A^{(3)}$ {\tt*} $A^{(2)}$ & (2) \\


82 
& $$ & {\tt} $A^{(3)}$ & (3)


83 
\end{tabular}


84 
\end{center}


85 
\index{priority grammars)}


86 


87 


88 
\begin{figure}


89 
\begin{center}


90 
\begin{tabular}{rclc}


91 
$prop$ &=& {\tt PROP} $aprop$ ~~$$~~ {\tt(} $prop$ {\tt)} \\


92 
&$$& $logic^{(3)}$ {\tt ==} $logic^{(2)}$ & (2) \\


93 
&$$& $logic^{(3)}$ {\tt =?=} $logic^{(2)}$ & (2) \\


94 
&$$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\


95 
&$$& {\tt[} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt]} {\tt==>} $prop^{(1)}$ & (1) \\


96 
&$$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\


97 
$logic$ &=& $prop$ ~~$$~~ $fun$ \\\\


98 
$aprop$ &=& $id$ ~~$$~~ $var$


99 
~~$$~~ $fun^{(\infty)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\


100 
$fun$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $fun$ {\tt)} \\


101 
&$$& $fun^{(\infty)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\

452

102 
&$$& $fun^{(4)}$ {\tt::} $type$ & (4) \\

320

103 
&$$& {\tt \%} $idts$ {\tt.} $logic$ & (0) \\\\


104 
$idts$ &=& $idt$ ~~$$~~ $idt^{(1)}$ $idts$ \\\\


105 
$idt$ &=& $id$ ~~$$~~ {\tt(} $idt$ {\tt)} \\


106 
&$$& $id$ {\tt ::} $type$ & (0) \\\\


107 
$type$ &=& $tid$ ~~$$~~ $tvar$ ~~$$~~ $tid$ {\tt::} $sort$


108 
~~$$~~ $tvar$ {\tt::} $sort$ \\


109 
&$$& $id$ ~~$$~~ $type^{(\infty)}$ $id$


110 
~~$$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\


111 
&$$& $type^{(1)}$ {\tt =>} $type$ & (0) \\


112 
&$$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\


113 
&$$& {\tt(} $type$ {\tt)} \\\\


114 
$sort$ &=& $id$ ~~$$~~ {\tt\ttlbrace\ttrbrace}


115 
~~$$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace}


116 
\end{tabular}


117 
\index{*PROP symbol}


118 
\index{*== symbol}\index{*=?= symbol}\index{*==> symbol}


119 
\index{*:: symbol}\index{*=> symbol}

332

120 
\index{sort constraints}


121 
%the index command: a percent is permitted, but braces must match!

320

122 
\index{%@{\tt\%} symbol}


123 
\index{{}@{\tt\ttlbrace} symbol}\index{{}@{\tt\ttrbrace} symbol}


124 
\index{*[ symbol}\index{*] symbol}


125 
\index{*"!"! symbol}


126 
\index{*"[" symbol}


127 
\index{*""] symbol}


128 
\end{center}


129 
\caption{Metalogic syntax}\label{fig:pure_gram}


130 
\end{figure}


131 


132 


133 
\section{The Pure syntax} \label{sec:basic_syntax}


134 
\index{syntax!Pure(}


135 


136 
At the root of all objectlogics lies the theory \thydx{Pure}. It


137 
contains, among many other things, the Pure syntax. An informal account of


138 
this basic syntax (types, terms and formulae) appears in


139 
\iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}%


140 
{\S\ref{sec:forward}}. A more precise description using a priority grammar


141 
appears in Fig.\ts\ref{fig:pure_gram}. It defines the following


142 
nonterminals:


143 
\begin{ttdescription}


144 
\item[\ndxbold{prop}] denotes terms of type {\tt prop}. These are formulae


145 
of the metalogic.


146 


147 
\item[\ndxbold{aprop}] denotes atomic propositions. These typically


148 
include the judgement forms of the objectlogic; its definition


149 
introduces a metalevel predicate for each judgement form.


150 


151 
\item[\ndxbold{logic}] denotes terms whose type belongs to class


152 
\cldx{logic}. As the syntax is extended by new objectlogics, more


153 
productions for {\tt logic} are added automatically (see below).


154 


155 
\item[\ndxbold{fun}] denotes terms potentially of function type.


156 


157 
\item[\ndxbold{type}] denotes types of the metalogic.


158 


159 
\item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained


160 
by types.


161 
\end{ttdescription}


162 


163 
\begin{warn}


164 
In {\tt idts}, note that \verbx::nat y is parsed as \verbx::(nat y),


165 
treating {\tt y} like a type constructor applied to {\tt nat}. The


166 
likely result is an error message. To avoid this interpretation, use


167 
parentheses and write \verb(x::nat) y.

332

168 
\index{type constraints}\index{*:: symbol}

320

169 


170 
Similarly, \verbx::nat y::nat is parsed as \verbx::(nat y::nat) and


171 
yields an error. The correct form is \verb(x::nat) (y::nat).


172 
\end{warn}


173 


174 
\subsection{Logical types and default syntax}\label{logicaltypes}


175 
\index{lambda calc@$\lambda$calculus}


176 


177 
Isabelle's representation of mathematical languages is based on the simply


178 
typed $\lambda$calculus. All logical types, namely those of class


179 
\cldx{logic}, are automatically equipped with a basic syntax of types,


180 
identifiers, variables, parentheses, $\lambda$abstractions and


181 
applications.


182 


183 
More precisely, for each type constructor $ty$ with arity $(\vec{s})c$,


184 
where $c$ is a subclass of \cldx{logic}, several productions are added:


185 
\begin{center}


186 
\begin{tabular}{rclc}


187 
$ty$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $ty$ {\tt)} \\


188 
&$$& $fun^{(\infty)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\

452

189 
&$$& $ty^{(4)}$ {\tt::} $type$ ~~~~~~~ (3) \\\\

320

190 
$logic$ &=& $ty$


191 
\end{tabular}


192 
\end{center}


193 

452

194 
\begin{warn}


195 
Type constraints bind very weakly. For example, \verb!x<y::nat! is normally


196 
parsed as \verb!(x<y)::nat!, unless \verb$<$ has priority of 4 or less, in


197 
which case the string is likely to be ambiguous. The correct form is


198 
\verb!x<(y::nat)!.


199 
\end{warn}

320

200 


201 
\subsection{Lexical matters}


202 
The parser does not process input strings directly. It operates on token


203 
lists provided by Isabelle's \bfindex{lexer}. There are two kinds of


204 
tokens: \bfindex{delimiters} and \bfindex{name tokens}.


205 


206 
\index{reserved words}


207 
Delimiters can be regarded as reserved words of the syntax. You can


208 
add new ones when extending theories. In Fig.\ts\ref{fig:pure_gram} they


209 
appear in typewriter font, for example {\tt ==}, {\tt =?=} and


210 
{\tt PROP}\@.


211 


212 
Name tokens have a predefined syntax. The lexer distinguishes four


213 
disjoint classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type


214 
identifiers\index{type identifiers}, type unknowns\index{type unknowns}.


215 
They are denoted by \ndxbold{id}, \ndxbold{var}, \ndxbold{tid},


216 
\ndxbold{tvar}, respectively. Typical examples are {\tt x}, {\tt ?x7},


217 
{\tt 'a}, {\tt ?'a3}. Here is the precise syntax:


218 
\begin{eqnarray*}


219 
id & = & letter~quasiletter^* \\


220 
var & = & \mbox{\tt ?}id ~~~~ \mbox{\tt ?}id\mbox{\tt .}nat \\


221 
tid & = & \mbox{\tt '}id \\


222 
tvar & = & \mbox{\tt ?}tid ~~~~


223 
\mbox{\tt ?}tid\mbox{\tt .}nat \\[1ex]


224 
letter & = & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\


225 
digit & = & \mbox{one of {\tt 0}\dots {\tt 9}} \\


226 
quasiletter & = & letter ~~~~ digit ~~~~ \mbox{\tt _} ~~~~ \mbox{\tt '} \\


227 
nat & = & digit^+


228 
\end{eqnarray*}


229 
A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally


230 
a pair of base name and index (\ML\ type \mltydx{indexname}). These


231 
components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or


232 
run together as in {\tt ?x1}. The latter form is possible if the base name


233 
does not end with digits. If the index is 0, it may be dropped altogether:


234 
{\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}.


235 


236 
The lexer repeatedly takes the maximal prefix of the input string that


237 
forms a valid token. A maximal prefix that is both a delimiter and a name


238 
is treated as a delimiter. Spaces, tabs and newlines are separators; they


239 
never occur within tokens.


240 


241 
Delimiters need not be separated by white space. For example, if {\tt }


242 
is a delimiter but {\tt } is not, then the string {\tt } is treated as


243 
two consecutive occurrences of the token~{\tt }. In contrast, \ML\


244 
treats {\tt } as a single symbolic name. The consequence of Isabelle's


245 
more liberal scheme is that the same string may be parsed in different ways


246 
after extending the syntax: after adding {\tt } as a delimiter, the input


247 
{\tt } is treated as a single token.


248 


249 
Although name tokens are returned from the lexer rather than the parser, it


250 
is more logical to regard them as nonterminals. Delimiters, however, are


251 
terminals; they are just syntactic sugar and contribute nothing to the


252 
abstract syntax tree.


253 


254 


255 
\subsection{*Inspecting the syntax}


256 
\begin{ttbox}


257 
syn_of : theory > Syntax.syntax


258 
Syntax.print_syntax : Syntax.syntax > unit


259 
Syntax.print_gram : Syntax.syntax > unit


260 
Syntax.print_trans : Syntax.syntax > unit


261 
\end{ttbox}


262 
The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes


263 
in \ML. You can display values of this type by calling the following


264 
functions:


265 
\begin{ttdescription}


266 
\item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle


267 
theory~{\it thy} as an \ML\ value.


268 


269 
\item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all


270 
information contained in the syntax {\it syn}. The displayed output can


271 
be large. The following two functions are more selective.


272 


273 
\item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part


274 
of~{\it syn}, namely the lexicon, roots and productions. These are


275 
discussed below.


276 


277 
\item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation


278 
part of~{\it syn}, namely the constants, parse/print macros and


279 
parse/print translations.


280 
\end{ttdescription}


281 


282 
Let us demonstrate these functions by inspecting Pure's syntax. Even that


283 
is too verbose to display in full.


284 
\begin{ttbox}\index{*Pure theory}


285 
Syntax.print_syntax (syn_of Pure.thy);


286 
{\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots}


287 
{\out roots: logic type fun prop}


288 
{\out prods:}


289 
{\out type = tid (1000)}


290 
{\out type = tvar (1000)}


291 
{\out type = id (1000)}


292 
{\out type = tid "::" sort[0] => "_ofsort" (1000)}


293 
{\out type = tvar "::" sort[0] => "_ofsort" (1000)}


294 
{\out \vdots}


295 
\ttbreak


296 
{\out consts: "_K" "_appl" "_aprop" "_args" "_asms" "_bigimpl" \dots}


297 
{\out parse_ast_translation: "_appl" "_bigimpl" "_bracket"}


298 
{\out "_idtyp" "_lambda" "_tapp" "_tappl"}


299 
{\out parse_rules:}


300 
{\out parse_translation: "!!" "_K" "_abs" "_aprop"}


301 
{\out print_translation: "all"}


302 
{\out print_rules:}


303 
{\out print_ast_translation: "==>" "_abs" "_idts" "fun"}


304 
\end{ttbox}


305 

332

306 
As you can see, the output is divided into labelled sections. The grammar

320

307 
is represented by {\tt lexicon}, {\tt roots} and {\tt prods}. The rest


308 
refers to syntactic translations and macro expansion. Here is an


309 
explanation of the various sections.


310 
\begin{description}


311 
\item[{\tt lexicon}] lists the delimiters used for lexical


312 
analysis.\index{delimiters}


313 


314 
\item[{\tt roots}] lists the grammar's nonterminal symbols. You must


315 
name the desired root when calling lower level functions or specifying


316 
macros. Higher level functions usually expect a type and derive the


317 
actual root as described in~\S\ref{sec:grammar}.


318 


319 
\item[{\tt prods}] lists the \rmindex{productions} of the priority grammar.


320 
The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}.


321 
Each delimiter is quoted. Some productions are shown with {\tt =>} and


322 
an attached string. These strings later become the heads of parse


323 
trees; they also play a vital role when terms are printed (see


324 
\S\ref{sec:asts}).


325 


326 
Productions with no strings attached are called {\bf copy


327 
productions}\indexbold{productions!copy}. Their righthand side must


328 
have exactly one nonterminal symbol (or name token). The parser does


329 
not create a new parse tree node for copy productions, but simply


330 
returns the parse tree of the righthand symbol.


331 


332 
If the righthand side consists of a single nonterminal with no


333 
delimiters, then the copy production is called a {\bf chain


334 
production}. Chain productions act as abbreviations:


335 
conceptually, they are removed from the grammar by adding new


336 
productions. Priority information attached to chain productions is


337 
ignored; only the dummy value $1$ is displayed.


338 


339 
\item[{\tt consts}, {\tt parse_rules}, {\tt print_rules}]


340 
relate to macros (see \S\ref{sec:macros}).


341 


342 
\item[{\tt parse_ast_translation}, {\tt print_ast_translation}]


343 
list sets of constants that invoke translation functions for abstract


344 
syntax trees. Section \S\ref{sec:asts} below discusses this obscure


345 
matter.\index{constants!for translations}


346 


347 
\item[{\tt parse_translation}, {\tt print_translation}] list sets


348 
of constants that invoke translation functions for terms (see


349 
\S\ref{sec:tr_funs}).


350 
\end{description}


351 
\index{syntax!Pure)}


352 


353 


354 
\section{Mixfix declarations} \label{sec:mixfix}


355 
\index{mixfix declarations(}


356 


357 
When defining a theory, you declare new constants by giving their names,


358 
their type, and an optional {\bf mixfix annotation}. Mixfix annotations


359 
allow you to extend Isabelle's basic $\lambda$calculus syntax with


360 
readable notation. They can express any contextfree priority grammar.


361 
Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more


362 
general than the priority declarations of \ML\ and Prolog.


363 


364 
A mixfix annotation defines a production of the priority grammar. It


365 
describes the concrete syntax, the translation to abstract syntax, and the


366 
pretty printing. Special case annotations provide a simple means of


367 
specifying infix operators, binders and so forth.


368 


369 
\subsection{Grammar productions}\label{sec:grammar}\index{productions}


370 


371 
Let us examine the treatment of the production


372 
\[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\,


373 
A@n^{(p@n)}\, w@n. \]


374 
Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$,


375 
\ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals.


376 
In the corresponding mixfix annotation, the priorities are given separately


377 
as $[p@1,\ldots,p@n]$ and~$p$. The nonterminal symbols are identified with


378 
types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's


379 
effect on nonterminals is expressed as the function type


380 
\[ [\tau@1, \ldots, \tau@n]\To \tau. \]


381 
Finally, the template


382 
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n \]


383 
describes the strings of terminals.


384 


385 
A simple type is typically declared for each nonterminal symbol. In


386 
firstorder logic, type~$i$ stands for terms and~$o$ for formulae. Only


387 
the outermost type constructor is taken into account. For example, any


388 
type of the form $\sigma list$ stands for a list; productions may refer

332

389 
to the symbol {\tt list} and will apply to lists of any type.

320

390 


391 
The symbol associated with a type is called its {\bf root} since it may


392 
serve as the root of a parse tree. Precisely, the root of $(\tau@1, \dots,


393 
\tau@n)ty$ is $ty$, where $\tau@1$, \ldots, $\tau@n$ are types and $ty$ is


394 
a type constructor. Type infixes are a special case of this; in


395 
particular, the root of $\tau@1 \To \tau@2$ is {\tt fun}. Finally, the


396 
root of a type variable is {\tt logic}; general productions might


397 
refer to this nonterminal.


398 


399 
Identifying nonterminals with types allows a constant's type to specify


400 
syntax as well. We can declare the function~$f$ to have type $[\tau@1,


401 
\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the


402 
layout of the function's $n$ arguments. The constant's name, in this


403 
case~$f$, will also serve as the label in the abstract syntax tree. There


404 
are two exceptions to this treatment of constants:


405 
\begin{enumerate}\index{constants!syntactic}


406 
\item A production need not map directly to a logical function. In this


407 
case, you must declare a constant whose purpose is purely syntactic.


408 
By convention such constants begin with the symbol~{\tt\at},


409 
ensuring that they can never be written in formulae.


410 


411 
\item A copy production has no associated constant.\index{productions!copy}


412 
\end{enumerate}


413 
There is something artificial about this representation of productions,


414 
but it is convenient, particularly for simple theory extensions.


415 


416 
\subsection{The general mixfix form}


417 
Here is a detailed account of mixfix declarations. Suppose the following


418 
line occurs within the {\tt consts} section of a {\tt .thy} file:


419 
\begin{center}


420 
{\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)}


421 
\end{center}

332

422 
This constant declaration and mixfix annotation are interpreted as follows:

320

423 
\begin{itemize}\index{productions}


424 
\item The string {\tt $c$} is the name of the constant associated with the


425 
production; unless it is a valid identifier, it must be enclosed in


426 
quotes. If $c$ is empty (given as~{\tt ""}) then this is a copy


427 
production.\index{productions!copy} Otherwise, parsing an instance of the


428 
phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$


429 
$a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$th


430 
argument.


431 


432 
\item The constant $c$, if nonempty, is declared to have type $\sigma$.


433 


434 
\item The string $template$ specifies the righthand side of


435 
the production. It has the form


436 
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \]


437 
where each occurrence of {\tt_} denotes an argument position and


438 
the~$w@i$ do not contain~{\tt _}. (If you want a literal~{\tt _} in


439 
the concrete syntax, you must escape it as described below.) The $w@i$


440 
may consist of \rmindex{delimiters}, spaces or


441 
\rmindex{pretty printing} annotations (see below).


442 


443 
\item The type $\sigma$ specifies the production's nonterminal symbols


444 
(or name tokens). If $template$ is of the form above then $\sigma$


445 
must be a function type with at least~$n$ argument positions, say


446 
$\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are


447 
derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described


448 
above. Any of these may be function types; the corresponding root is


449 
then \tydx{fun}.


450 


451 
\item The optional list~$ps$ may contain at most $n$ integers, say {\tt


452 
[$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal


453 
priority\indexbold{priorities} required of any phrase that may appear


454 
as the $i$th argument. Missing priorities default to~0.


455 


456 
\item The integer $p$ is the priority of this production. If omitted, it


457 
defaults to the maximal priority.


458 
Priorities range between 0 and \ttindexbold{max_pri} (= 1000).


459 
\end{itemize}


460 
%


461 
The declaration {\tt $c$ ::\ "$\sigma$" ("$template$")} specifies no


462 
priorities. The resulting production puts no priority constraints on any


463 
of its arguments and has maximal priority itself. Omitting priorities in


464 
this manner will introduce syntactic ambiguities unless the production's


465 
righthand side is fully bracketed, as in \verb"if _ then _ else _ fi".


466 


467 
Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"},


468 
is sensible only if~$c$ is an identifier. Otherwise you will be unable to


469 
write terms involving~$c$.


470 


471 
\begin{warn}


472 
Theories must sometimes declare types for purely syntactic purposes. One


473 
example is \tydx{type}, the builtin type of types. This is a `type of


474 
all types' in the syntactic sense only. Do not declare such types under


475 
{\tt arities} as belonging to class {\tt logic}\index{*logic class}, for


476 
that would allow their use in arbitrary Isabelle


477 
expressions~(\S\ref{logicaltypes}).


478 
\end{warn}


479 


480 
\subsection{Example: arithmetic expressions}


481 
\index{examples!of mixfix declarations}


482 
This theory specification contains a {\tt consts} section with mixfix


483 
declarations encoding the priority grammar from


484 
\S\ref{sec:priority_grammars}:


485 
\begin{ttbox}


486 
EXP = Pure +


487 
types


488 
exp


489 
arities


490 
exp :: logic


491 
consts


492 
"0" :: "exp" ("0" 9)


493 
"+" :: "[exp, exp] => exp" ("_ + _" [0, 1] 0)


494 
"*" :: "[exp, exp] => exp" ("_ * _" [3, 2] 2)


495 
"" :: "exp => exp" (" _" [3] 3)


496 
end


497 
\end{ttbox}


498 
The {\tt arities} declaration causes {\tt exp} to be added as a new root.

332

499 
If you put this into a file {\tt EXP.thy} and load it via {\tt

320

500 
use_thy "EXP"}, you can run some tests:


501 
\begin{ttbox}


502 
val read_exp = Syntax.test_read (syn_of EXP.thy) "exp";


503 
{\out val it = fn : string > unit}


504 
read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0";


505 
{\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"}


506 
{\out raw: ("+" ("+" ("+" ("*" "0" ("*" "0" ("*" "0" "0"))) "0") "0") "0")}


507 
{\out \vdots}


508 
read_exp "0 +  0 + 0";


509 
{\out tokens: "0" "+" "" "0" "+" "0"}


510 
{\out raw: ("+" ("+" "0" ("" "0")) "0")}


511 
{\out \vdots}


512 
\end{ttbox}


513 
The output of \ttindex{Syntax.test_read} includes the token list ({\tt


514 
tokens}) and the raw \AST{} directly derived from the parse tree,


515 
ignoring parse \AST{} translations. The rest is tracing information


516 
provided by the macro expander (see \S\ref{sec:macros}).


517 


518 
Executing {\tt Syntax.print_gram} reveals the productions derived


519 
from our mixfix declarations (lots of additional information deleted):


520 
\begin{ttbox}


521 
Syntax.print_gram (syn_of EXP.thy);


522 
{\out exp = "0" => "0" (9)}


523 
{\out exp = exp[0] "+" exp[1] => "+" (0)}


524 
{\out exp = exp[3] "*" exp[2] => "*" (2)}


525 
{\out exp = "" exp[3] => "" (3)}


526 
\end{ttbox}


527 


528 


529 
\subsection{The mixfix template}


530 
Let us take a closer look at the string $template$ appearing in mixfix


531 
annotations. This string specifies a list of parsing and printing


532 
directives: delimiters\index{delimiters}, arguments, spaces, blocks of


533 
indentation and line breaks. These are encoded by the following character


534 
sequences:


535 
\index{pretty printing(}


536 
\begin{description}


537 
\item[~$d$~] is a delimiter, namely a nonempty sequence of characters


538 
other than the special characters {\tt _}, {\tt(}, {\tt)} and~{\tt/}.


539 
Even these characters may appear if escaped; this means preceding it with


540 
a~{\tt '} (single quote). Thus you have to write {\tt ''} if you really


541 
want a single quote. Delimiters may never contain spaces.


542 


543 
\item[~{\tt_}~] is an argument position, which stands for a nonterminal symbol


544 
or name token.


545 


546 
\item[~$s$~] is a nonempty sequence of spaces for printing. This and the


547 
following specifications do not affect parsing at all.


548 


549 
\item[~{\tt(}$n$~] opens a pretty printing block. The optional number $n$


550 
specifies how much indentation to add when a line break occurs within the


551 
block. If {\tt(} is not followed by digits, the indentation defaults


552 
to~0.


553 


554 
\item[~{\tt)}~] closes a pretty printing block.


555 


556 
\item[~{\tt//}~] forces a line break.


557 


558 
\item[~{\tt/}$s$~] allows a line break. Here $s$ stands for the string of


559 
spaces (zero or more) right after the {\tt /} character. These spaces


560 
are printed if the break is not taken.


561 
\end{description}


562 
For example, the template {\tt"(_ +/ _)"} specifies an infix operator.


563 
There are two argument positions; the delimiter~{\tt+} is preceded by a


564 
space and followed by a space or line break; the entire phrase is a pretty


565 
printing block. Other examples appear in Fig.\ts\ref{fig:set_trans} below.


566 
Isabelle's pretty printer resembles the one described in


567 
Paulson~\cite{paulson91}.


568 


569 
\index{pretty printing)}


570 


571 


572 
\subsection{Infixes}


573 
\indexbold{infixes}


574 


575 
Infix operators associating to the left or right can be declared


576 
using {\tt infixl} or {\tt infixr}.


577 
Roughly speaking, the form {\tt $c$ ::\ "$\sigma$" (infixl $p$)}


578 
abbreviates the constant declarations


579 
\begin{ttbox}


580 
"op \(c\)" :: "\(\sigma\)" ("op \(c\)")


581 
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\))


582 
\end{ttbox}


583 
and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the constant declarations


584 
\begin{ttbox}


585 
"op \(c\)" :: "\(\sigma\)" ("op \(c\)")


586 
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\))


587 
\end{ttbox}


588 
The infix operator is declared as a constant with the prefix {\tt op}.


589 
Thus, prefixing infixes with \sdx{op} makes them behave like ordinary


590 
function symbols, as in \ML. Special characters occurring in~$c$ must be


591 
escaped, as in delimiters, using a single quote.


592 


593 
The expanded forms above would be actually illegal in a {\tt .thy} file


594 
because they declare the constant \hbox{\tt"op \(c\)"} twice.


595 


596 


597 
\subsection{Binders}


598 
\indexbold{binders}


599 
\begingroup


600 
\def\Q{{\cal Q}}


601 
A {\bf binder} is a variablebinding construct such as a quantifier. The


602 
constant declaration


603 
\begin{ttbox}


604 
\(c\) :: "\(\sigma\)" (binder "\(\Q\)" \(p\))


605 
\end{ttbox}


606 
introduces a constant~$c$ of type~$\sigma$, which must have the form


607 
$(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where


608 
$x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$


609 
and the whole term has type~$\tau@3$. Special characters in $\Q$ must be


610 
escaped using a single quote.


611 


612 
The declaration is expanded internally to


613 
\begin{ttbox}


614 
\(c\) :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)"


615 
"\(\Q\)"\hskip3pt :: "[idts, \(\tau@2\)] => \(\tau@3\)" ("(3\(\Q\)_./ _)" \(p\))


616 
\end{ttbox}


617 
Here \ndx{idts} is the nonterminal symbol for a list of identifiers with

332

618 
\index{type constraints}

320

619 
optional type constraints (see Fig.\ts\ref{fig:pure_gram}). The


620 
declaration also installs a parse translation\index{translations!parse}


621 
for~$\Q$ and a print translation\index{translations!print} for~$c$ to


622 
translate between the internal and external forms.


623 


624 
A binder of type $(\sigma \To \tau) \To \tau$ can be nested by giving a


625 
list of variables. The external form $\Q~x@1~x@2 \ldots x@n. P$


626 
corresponds to the internal form


627 
\[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)). \]


628 


629 
\medskip


630 
For example, let us declare the quantifier~$\forall$:\index{quantifiers}


631 
\begin{ttbox}


632 
All :: "('a => o) => o" (binder "ALL " 10)


633 
\end{ttbox}


634 
This lets us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL


635 
$x$.$P$}. When printing, Isabelle prefers the latter form, but must fall


636 
back on ${\tt All}(P)$ if $P$ is not an abstraction. Both $P$ and {\tt ALL


637 
$x$.$P$} have type~$o$, the type of formulae, while the bound variable


638 
can be polymorphic.


639 
\endgroup


640 


641 
\index{mixfix declarations)}


642 


643 


644 
\section{Example: some minimal logics} \label{sec:min_logics}


645 
\index{examples!of logic definitions}


646 


647 
This section presents some examples that have a simple syntax. They


648 
demonstrate how to define new objectlogics from scratch.


649 


650 
First we must define how an objectlogic syntax embedded into the


651 
metalogic. Since all theorems must conform to the syntax for~\ndx{prop} (see


652 
Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the


653 
objectlevel syntax. Assume that the syntax of your objectlogic defines a


654 
nonterminal symbol~\ndx{o} of formulae. These formulae can now appear in


655 
axioms and theorems wherever \ndx{prop} does if you add the production


656 
\[ prop ~=~ o. \]


657 
This is not a copy production but a coercion from formulae to propositions:


658 
\begin{ttbox}


659 
Base = Pure +


660 
types


661 
o


662 
arities


663 
o :: logic


664 
consts


665 
Trueprop :: "o => prop" ("_" 5)


666 
end


667 
\end{ttbox}


668 
The constant \cdx{Trueprop} (the name is arbitrary) acts as an invisible

332

669 
coercion function. Assuming this definition resides in a file {\tt Base.thy},

320

670 
you have to load it with the command {\tt use_thy "Base"}.


671 


672 
One of the simplest nontrivial logics is {\bf minimal logic} of


673 
implication. Its definition in Isabelle needs no advanced features but


674 
illustrates the overall mechanism nicely:


675 
\begin{ttbox}


676 
Hilbert = Base +


677 
consts


678 
">" :: "[o, o] => o" (infixr 10)


679 
rules


680 
K "P > Q > P"


681 
S "(P > Q > R) > (P > Q) > P > R"


682 
MP "[ P > Q; P ] ==> Q"


683 
end


684 
\end{ttbox}

332

685 
After loading this definition from the file {\tt Hilbert.thy}, you can

320

686 
start to prove theorems in the logic:


687 
\begin{ttbox}


688 
goal Hilbert.thy "P > P";


689 
{\out Level 0}


690 
{\out P > P}


691 
{\out 1. P > P}


692 
\ttbreak


693 
by (resolve_tac [Hilbert.MP] 1);


694 
{\out Level 1}


695 
{\out P > P}


696 
{\out 1. ?P > P > P}


697 
{\out 2. ?P}


698 
\ttbreak


699 
by (resolve_tac [Hilbert.MP] 1);


700 
{\out Level 2}


701 
{\out P > P}


702 
{\out 1. ?P1 > ?P > P > P}


703 
{\out 2. ?P1}


704 
{\out 3. ?P}


705 
\ttbreak


706 
by (resolve_tac [Hilbert.S] 1);


707 
{\out Level 3}


708 
{\out P > P}


709 
{\out 1. P > ?Q2 > P}


710 
{\out 2. P > ?Q2}


711 
\ttbreak


712 
by (resolve_tac [Hilbert.K] 1);


713 
{\out Level 4}


714 
{\out P > P}


715 
{\out 1. P > ?Q2}


716 
\ttbreak


717 
by (resolve_tac [Hilbert.K] 1);


718 
{\out Level 5}


719 
{\out P > P}


720 
{\out No subgoals!}


721 
\end{ttbox}


722 
As we can see, this Hilbertstyle formulation of minimal logic is easy to


723 
define but difficult to use. The following natural deduction formulation is


724 
better:


725 
\begin{ttbox}


726 
MinI = Base +


727 
consts


728 
">" :: "[o, o] => o" (infixr 10)


729 
rules


730 
impI "(P ==> Q) ==> P > Q"


731 
impE "[ P > Q; P ] ==> Q"


732 
end


733 
\end{ttbox}


734 
Note, however, that although the two systems are equivalent, this fact


735 
cannot be proved within Isabelle. Axioms {\tt S} and {\tt K} can be


736 
derived in {\tt MinI} (exercise!), but {\tt impI} cannot be derived in {\tt


737 
Hilbert}. The reason is that {\tt impI} is only an {\bf admissible} rule


738 
in {\tt Hilbert}, something that can only be shown by induction over all


739 
possible proofs in {\tt Hilbert}.


740 


741 
We may easily extend minimal logic with falsity:


742 
\begin{ttbox}


743 
MinIF = MinI +


744 
consts


745 
False :: "o"


746 
rules


747 
FalseE "False ==> P"


748 
end


749 
\end{ttbox}


750 
On the other hand, we may wish to introduce conjunction only:


751 
\begin{ttbox}


752 
MinC = Base +


753 
consts


754 
"&" :: "[o, o] => o" (infixr 30)


755 
\ttbreak


756 
rules


757 
conjI "[ P; Q ] ==> P & Q"


758 
conjE1 "P & Q ==> P"


759 
conjE2 "P & Q ==> Q"


760 
end


761 
\end{ttbox}


762 
And if we want to have all three connectives together, we create and load a


763 
theory file consisting of a single line:\footnote{We can combine the


764 
theories without creating a theory file using the ML declaration


765 
\begin{ttbox}


766 
val MinIFC_thy = merge_theories(MinIF,MinC)


767 
\end{ttbox}


768 
\index{*merge_theoriesfnote}}


769 
\begin{ttbox}


770 
MinIFC = MinIF + MinC


771 
\end{ttbox}


772 
Now we can prove mixed theorems like


773 
\begin{ttbox}


774 
goal MinIFC.thy "P & False > Q";


775 
by (resolve_tac [MinI.impI] 1);


776 
by (dresolve_tac [MinC.conjE2] 1);


777 
by (eresolve_tac [MinIF.FalseE] 1);


778 
\end{ttbox}


779 
Try this as an exercise!


780 


781 
