320

1 
%% $Id$


2 
\chapter{Defining Logics} \label{DefiningLogics}


3 
This chapter explains how to define new formal systems  in particular,


4 
their concrete syntax. While Isabelle can be regarded as a theorem prover


5 
for set theory, higherorder logic or the sequent calculus, its


6 
distinguishing feature is support for the definition of new logics.


7 


8 
Isabelle logics are hierarchies of theories, which are described and


9 
illustrated in


10 
\iflabelundefined{sec:definingtheories}{{\em Introduction to Isabelle}}%


11 
{\S\ref{sec:definingtheories}}. That material, together with the theory


12 
files provided in the examples directories, should suffice for all simple


13 
applications. The easiest way to define a new theory is by modifying a


14 
copy of an existing theory.


15 


16 
This chapter documents the metalogic syntax, mixfix declarations and


17 
pretty printing. The extended examples in \S\ref{sec:min_logics}


18 
demonstrate the logical aspects of the definition of theories.


19 


20 


21 
\section{Priority grammars} \label{sec:priority_grammars}


22 
\index{priority grammars(}


23 


24 
A contextfree grammar contains a set of {\bf nonterminal symbols}, a set of


25 
{\bf terminal symbols} and a set of {\bf productions}\index{productions}.


26 
Productions have the form ${A=\gamma}$, where $A$ is a nonterminal and


27 
$\gamma$ is a string of terminals and nonterminals. One designated


28 
nonterminal is called the {\bf start symbol}. The language defined by the


29 
grammar consists of all strings of terminals that can be derived from the


30 
start symbol by applying productions as rewrite rules.


31 


32 
The syntax of an Isabelle logic is specified by a {\bf priority


33 
grammar}.\index{priorities} Each nonterminal is decorated by an integer


34 
priority, as in~$A^{(p)}$. A nonterminal $A^{(p)}$ in a derivation may be


35 
rewritten using a production $A^{(q)} = \gamma$ only if~$p \le q$. Any


36 
priority grammar can be translated into a normal context free grammar by


37 
introducing new nonterminals and productions.


38 


39 
Formally, a set of context free productions $G$ induces a derivation


40 
relation $\longrightarrow@G$. Let $\alpha$ and $\beta$ denote strings of


41 
terminal or nonterminal symbols. Then


42 
\[ \alpha\, A^{(p)}\, \beta ~\longrightarrow@G~ \alpha\,\gamma\,\beta \]


43 
if and only if $G$ contains some production $A^{(q)}=\gamma$ for~$p \le q$.


44 


45 
The following simple grammar for arithmetic expressions demonstrates how


46 
binding power and associativity of operators can be enforced by priorities.


47 
\begin{center}


48 
\begin{tabular}{rclr}


49 
$A^{(9)}$ & = & {\tt0} \\


50 
$A^{(9)}$ & = & {\tt(} $A^{(0)}$ {\tt)} \\


51 
$A^{(0)}$ & = & $A^{(0)}$ {\tt+} $A^{(1)}$ \\


52 
$A^{(2)}$ & = & $A^{(3)}$ {\tt*} $A^{(2)}$ \\


53 
$A^{(3)}$ & = & {\tt} $A^{(3)}$


54 
\end{tabular}


55 
\end{center}


56 
The choice of priorities determines that {\tt } binds tighter than {\tt *},


57 
which binds tighter than {\tt +}. Furthermore {\tt +} associates to the


58 
left and {\tt *} to the right.


59 


60 
For clarity, grammars obey these conventions:


61 
\begin{itemize}


62 
\item All priorities must lie between~0 and \ttindex{max_pri}, which is a


63 
some fixed integer. Sometimes {\tt max_pri} is written as $\infty$.


64 
\item Priority 0 on the righthand side and priority \ttindex{max_pri} on


65 
the lefthand side may be omitted.


66 
\item The production $A^{(p)} = \alpha$ is written as $A = \alpha~(p)$; the


67 
priority of the lefthand side actually appears in a column on the far


68 
right.


69 
\item Alternatives are separated by~$$.


70 
\item Repetition is indicated by dots~(\dots) in an informal but obvious


71 
way.


72 
\end{itemize}


73 


74 
Using these conventions and assuming $\infty=9$, the grammar


75 
takes the form


76 
\begin{center}


77 
\begin{tabular}{rclc}


78 
$A$ & = & {\tt0} & \hspace*{4em} \\


79 
& $$ & {\tt(} $A$ {\tt)} \\


80 
& $$ & $A$ {\tt+} $A^{(1)}$ & (0) \\


81 
& $$ & $A^{(3)}$ {\tt*} $A^{(2)}$ & (2) \\


82 
& $$ & {\tt} $A^{(3)}$ & (3)


83 
\end{tabular}


84 
\end{center}


85 
\index{priority grammars)}


86 


87 


88 
\begin{figure}


89 
\begin{center}


90 
\begin{tabular}{rclc}


91 
$prop$ &=& {\tt PROP} $aprop$ ~~$$~~ {\tt(} $prop$ {\tt)} \\


92 
&$$& $logic^{(3)}$ {\tt ==} $logic^{(2)}$ & (2) \\


93 
&$$& $logic^{(3)}$ {\tt =?=} $logic^{(2)}$ & (2) \\


94 
&$$& $prop^{(2)}$ {\tt ==>} $prop^{(1)}$ & (1) \\


95 
&$$& {\tt[} $prop$ {\tt;} \dots {\tt;} $prop$ {\tt]} {\tt==>} $prop^{(1)}$ & (1) \\


96 
&$$& {\tt!!} $idts$ {\tt.} $prop$ & (0) \\\\


97 
$logic$ &=& $prop$ ~~$$~~ $fun$ \\\\


98 
$aprop$ &=& $id$ ~~$$~~ $var$


99 
~~$$~~ $fun^{(\infty)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\\\


100 
$fun$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $fun$ {\tt)} \\


101 
&$$& $fun^{(\infty)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)} \\


102 
&$$& $fun^{(\infty)}$ {\tt::} $type$ \\


103 
&$$& {\tt \%} $idts$ {\tt.} $logic$ & (0) \\\\


104 
$idts$ &=& $idt$ ~~$$~~ $idt^{(1)}$ $idts$ \\\\


105 
$idt$ &=& $id$ ~~$$~~ {\tt(} $idt$ {\tt)} \\


106 
&$$& $id$ {\tt ::} $type$ & (0) \\\\


107 
$type$ &=& $tid$ ~~$$~~ $tvar$ ~~$$~~ $tid$ {\tt::} $sort$


108 
~~$$~~ $tvar$ {\tt::} $sort$ \\


109 
&$$& $id$ ~~$$~~ $type^{(\infty)}$ $id$


110 
~~$$~~ {\tt(} $type$ {\tt,} \dots {\tt,} $type$ {\tt)} $id$ \\


111 
&$$& $type^{(1)}$ {\tt =>} $type$ & (0) \\


112 
&$$& {\tt[} $type$ {\tt,} \dots {\tt,} $type$ {\tt]} {\tt=>} $type$&(0)\\


113 
&$$& {\tt(} $type$ {\tt)} \\\\


114 
$sort$ &=& $id$ ~~$$~~ {\tt\ttlbrace\ttrbrace}


115 
~~$$~~ {\tt\ttlbrace} $id$ {\tt,} \dots {\tt,} $id$ {\tt\ttrbrace}


116 
\end{tabular}


117 
\index{*PROP symbol}


118 
\index{*== symbol}\index{*=?= symbol}\index{*==> symbol}


119 
\index{*:: symbol}\index{*=> symbol}


120 
%index command: percent is permitted, but braces must match!


121 
\index{%@{\tt\%} symbol}


122 
\index{{}@{\tt\ttlbrace} symbol}\index{{}@{\tt\ttrbrace} symbol}


123 
\index{*[ symbol}\index{*] symbol}


124 
\index{*"!"! symbol}


125 
\index{*"[" symbol}


126 
\index{*""] symbol}


127 
\end{center}


128 
\caption{Metalogic syntax}\label{fig:pure_gram}


129 
\end{figure}


130 


131 


132 
\section{The Pure syntax} \label{sec:basic_syntax}


133 
\index{syntax!Pure(}


134 


135 
At the root of all objectlogics lies the theory \thydx{Pure}. It


136 
contains, among many other things, the Pure syntax. An informal account of


137 
this basic syntax (types, terms and formulae) appears in


138 
\iflabelundefined{sec:forward}{{\em Introduction to Isabelle}}%


139 
{\S\ref{sec:forward}}. A more precise description using a priority grammar


140 
appears in Fig.\ts\ref{fig:pure_gram}. It defines the following


141 
nonterminals:


142 
\begin{ttdescription}


143 
\item[\ndxbold{prop}] denotes terms of type {\tt prop}. These are formulae


144 
of the metalogic.


145 


146 
\item[\ndxbold{aprop}] denotes atomic propositions. These typically


147 
include the judgement forms of the objectlogic; its definition


148 
introduces a metalevel predicate for each judgement form.


149 


150 
\item[\ndxbold{logic}] denotes terms whose type belongs to class


151 
\cldx{logic}. As the syntax is extended by new objectlogics, more


152 
productions for {\tt logic} are added automatically (see below).


153 


154 
\item[\ndxbold{fun}] denotes terms potentially of function type.


155 


156 
\item[\ndxbold{type}] denotes types of the metalogic.


157 


158 
\item[\ndxbold{idts}] denotes a list of identifiers, possibly constrained


159 
by types.


160 
\end{ttdescription}


161 


162 
\begin{warn}


163 
In {\tt idts}, note that \verbx::nat y is parsed as \verbx::(nat y),


164 
treating {\tt y} like a type constructor applied to {\tt nat}. The


165 
likely result is an error message. To avoid this interpretation, use


166 
parentheses and write \verb(x::nat) y.


167 


168 
Similarly, \verbx::nat y::nat is parsed as \verbx::(nat y::nat) and


169 
yields an error. The correct form is \verb(x::nat) (y::nat).


170 
\end{warn}


171 


172 
\subsection{Logical types and default syntax}\label{logicaltypes}


173 
\index{lambda calc@$\lambda$calculus}


174 


175 
Isabelle's representation of mathematical languages is based on the simply


176 
typed $\lambda$calculus. All logical types, namely those of class


177 
\cldx{logic}, are automatically equipped with a basic syntax of types,


178 
identifiers, variables, parentheses, $\lambda$abstractions and


179 
applications.


180 


181 
More precisely, for each type constructor $ty$ with arity $(\vec{s})c$,


182 
where $c$ is a subclass of \cldx{logic}, several productions are added:


183 
\begin{center}


184 
\begin{tabular}{rclc}


185 
$ty$ &=& $id$ ~~$$~~ $var$ ~~$$~~ {\tt(} $ty$ {\tt)} \\


186 
&$$& $fun^{(\infty)}$ {\tt(} $logic$ {\tt,} \dots {\tt,} $logic$ {\tt)}\\


187 
&$$& $ty^{(\infty)}$ {\tt::} $type$\\\\


188 
$logic$ &=& $ty$


189 
\end{tabular}


190 
\end{center}


191 


192 


193 
\subsection{Lexical matters}


194 
The parser does not process input strings directly. It operates on token


195 
lists provided by Isabelle's \bfindex{lexer}. There are two kinds of


196 
tokens: \bfindex{delimiters} and \bfindex{name tokens}.


197 


198 
\index{reserved words}


199 
Delimiters can be regarded as reserved words of the syntax. You can


200 
add new ones when extending theories. In Fig.\ts\ref{fig:pure_gram} they


201 
appear in typewriter font, for example {\tt ==}, {\tt =?=} and


202 
{\tt PROP}\@.


203 


204 
Name tokens have a predefined syntax. The lexer distinguishes four


205 
disjoint classes of names: \rmindex{identifiers}, \rmindex{unknowns}, type


206 
identifiers\index{type identifiers}, type unknowns\index{type unknowns}.


207 
They are denoted by \ndxbold{id}, \ndxbold{var}, \ndxbold{tid},


208 
\ndxbold{tvar}, respectively. Typical examples are {\tt x}, {\tt ?x7},


209 
{\tt 'a}, {\tt ?'a3}. Here is the precise syntax:


210 
\begin{eqnarray*}


211 
id & = & letter~quasiletter^* \\


212 
var & = & \mbox{\tt ?}id ~~~~ \mbox{\tt ?}id\mbox{\tt .}nat \\


213 
tid & = & \mbox{\tt '}id \\


214 
tvar & = & \mbox{\tt ?}tid ~~~~


215 
\mbox{\tt ?}tid\mbox{\tt .}nat \\[1ex]


216 
letter & = & \mbox{one of {\tt a}\dots {\tt z} {\tt A}\dots {\tt Z}} \\


217 
digit & = & \mbox{one of {\tt 0}\dots {\tt 9}} \\


218 
quasiletter & = & letter ~~~~ digit ~~~~ \mbox{\tt _} ~~~~ \mbox{\tt '} \\


219 
nat & = & digit^+


220 
\end{eqnarray*}


221 
A \ndxbold{var} or \ndxbold{tvar} describes an unknown, which is internally


222 
a pair of base name and index (\ML\ type \mltydx{indexname}). These


223 
components are either separated by a dot as in {\tt ?x.1} or {\tt ?x7.3} or


224 
run together as in {\tt ?x1}. The latter form is possible if the base name


225 
does not end with digits. If the index is 0, it may be dropped altogether:


226 
{\tt ?x} abbreviates both {\tt ?x0} and {\tt ?x.0}.


227 


228 
The lexer repeatedly takes the maximal prefix of the input string that


229 
forms a valid token. A maximal prefix that is both a delimiter and a name


230 
is treated as a delimiter. Spaces, tabs and newlines are separators; they


231 
never occur within tokens.


232 


233 
Delimiters need not be separated by white space. For example, if {\tt }


234 
is a delimiter but {\tt } is not, then the string {\tt } is treated as


235 
two consecutive occurrences of the token~{\tt }. In contrast, \ML\


236 
treats {\tt } as a single symbolic name. The consequence of Isabelle's


237 
more liberal scheme is that the same string may be parsed in different ways


238 
after extending the syntax: after adding {\tt } as a delimiter, the input


239 
{\tt } is treated as a single token.


240 


241 
Although name tokens are returned from the lexer rather than the parser, it


242 
is more logical to regard them as nonterminals. Delimiters, however, are


243 
terminals; they are just syntactic sugar and contribute nothing to the


244 
abstract syntax tree.


245 


246 


247 
\subsection{*Inspecting the syntax}


248 
\begin{ttbox}


249 
syn_of : theory > Syntax.syntax


250 
Syntax.print_syntax : Syntax.syntax > unit


251 
Syntax.print_gram : Syntax.syntax > unit


252 
Syntax.print_trans : Syntax.syntax > unit


253 
\end{ttbox}


254 
The abstract type \mltydx{Syntax.syntax} allows manipulation of syntaxes


255 
in \ML. You can display values of this type by calling the following


256 
functions:


257 
\begin{ttdescription}


258 
\item[\ttindexbold{syn_of} {\it thy}] returns the syntax of the Isabelle


259 
theory~{\it thy} as an \ML\ value.


260 


261 
\item[\ttindexbold{Syntax.print_syntax} {\it syn}] shows virtually all


262 
information contained in the syntax {\it syn}. The displayed output can


263 
be large. The following two functions are more selective.


264 


265 
\item[\ttindexbold{Syntax.print_gram} {\it syn}] shows the grammar part


266 
of~{\it syn}, namely the lexicon, roots and productions. These are


267 
discussed below.


268 


269 
\item[\ttindexbold{Syntax.print_trans} {\it syn}] shows the translation


270 
part of~{\it syn}, namely the constants, parse/print macros and


271 
parse/print translations.


272 
\end{ttdescription}


273 


274 
Let us demonstrate these functions by inspecting Pure's syntax. Even that


275 
is too verbose to display in full.


276 
\begin{ttbox}\index{*Pure theory}


277 
Syntax.print_syntax (syn_of Pure.thy);


278 
{\out lexicon: "!!" "\%" "(" ")" "," "." "::" ";" "==" "==>" \dots}


279 
{\out roots: logic type fun prop}


280 
{\out prods:}


281 
{\out type = tid (1000)}


282 
{\out type = tvar (1000)}


283 
{\out type = id (1000)}


284 
{\out type = tid "::" sort[0] => "_ofsort" (1000)}


285 
{\out type = tvar "::" sort[0] => "_ofsort" (1000)}


286 
{\out \vdots}


287 
\ttbreak


288 
{\out consts: "_K" "_appl" "_aprop" "_args" "_asms" "_bigimpl" \dots}


289 
{\out parse_ast_translation: "_appl" "_bigimpl" "_bracket"}


290 
{\out "_idtyp" "_lambda" "_tapp" "_tappl"}


291 
{\out parse_rules:}


292 
{\out parse_translation: "!!" "_K" "_abs" "_aprop"}


293 
{\out print_translation: "all"}


294 
{\out print_rules:}


295 
{\out print_ast_translation: "==>" "_abs" "_idts" "fun"}


296 
\end{ttbox}


297 


298 
As you can see, the output is divided into labeled sections. The grammar


299 
is represented by {\tt lexicon}, {\tt roots} and {\tt prods}. The rest


300 
refers to syntactic translations and macro expansion. Here is an


301 
explanation of the various sections.


302 
\begin{description}


303 
\item[{\tt lexicon}] lists the delimiters used for lexical


304 
analysis.\index{delimiters}


305 


306 
\item[{\tt roots}] lists the grammar's nonterminal symbols. You must


307 
name the desired root when calling lower level functions or specifying


308 
macros. Higher level functions usually expect a type and derive the


309 
actual root as described in~\S\ref{sec:grammar}.


310 


311 
\item[{\tt prods}] lists the \rmindex{productions} of the priority grammar.


312 
The nonterminal $A^{(n)}$ is rendered in {\sc ascii} as {\tt $A$[$n$]}.


313 
Each delimiter is quoted. Some productions are shown with {\tt =>} and


314 
an attached string. These strings later become the heads of parse


315 
trees; they also play a vital role when terms are printed (see


316 
\S\ref{sec:asts}).


317 


318 
Productions with no strings attached are called {\bf copy


319 
productions}\indexbold{productions!copy}. Their righthand side must


320 
have exactly one nonterminal symbol (or name token). The parser does


321 
not create a new parse tree node for copy productions, but simply


322 
returns the parse tree of the righthand symbol.


323 


324 
If the righthand side consists of a single nonterminal with no


325 
delimiters, then the copy production is called a {\bf chain


326 
production}. Chain productions act as abbreviations:


327 
conceptually, they are removed from the grammar by adding new


328 
productions. Priority information attached to chain productions is


329 
ignored; only the dummy value $1$ is displayed.


330 


331 
\item[{\tt consts}, {\tt parse_rules}, {\tt print_rules}]


332 
relate to macros (see \S\ref{sec:macros}).


333 


334 
\item[{\tt parse_ast_translation}, {\tt print_ast_translation}]


335 
list sets of constants that invoke translation functions for abstract


336 
syntax trees. Section \S\ref{sec:asts} below discusses this obscure


337 
matter.\index{constants!for translations}


338 


339 
\item[{\tt parse_translation}, {\tt print_translation}] list sets


340 
of constants that invoke translation functions for terms (see


341 
\S\ref{sec:tr_funs}).


342 
\end{description}


343 
\index{syntax!Pure)}


344 


345 


346 
\section{Mixfix declarations} \label{sec:mixfix}


347 
\index{mixfix declarations(}


348 


349 
When defining a theory, you declare new constants by giving their names,


350 
their type, and an optional {\bf mixfix annotation}. Mixfix annotations


351 
allow you to extend Isabelle's basic $\lambda$calculus syntax with


352 
readable notation. They can express any contextfree priority grammar.


353 
Isabelle syntax definitions are inspired by \OBJ~\cite{OBJ}; they are more


354 
general than the priority declarations of \ML\ and Prolog.


355 


356 
A mixfix annotation defines a production of the priority grammar. It


357 
describes the concrete syntax, the translation to abstract syntax, and the


358 
pretty printing. Special case annotations provide a simple means of


359 
specifying infix operators, binders and so forth.


360 


361 
\subsection{Grammar productions}\label{sec:grammar}\index{productions}


362 


363 
Let us examine the treatment of the production


364 
\[ A^{(p)}= w@0\, A@1^{(p@1)}\, w@1\, A@2^{(p@2)}\, \ldots\,


365 
A@n^{(p@n)}\, w@n. \]


366 
Here $A@i^{(p@i)}$ is a nonterminal with priority~$p@i$ for $i=1$,


367 
\ldots,~$n$, while $w@0$, \ldots,~$w@n$ are strings of terminals.


368 
In the corresponding mixfix annotation, the priorities are given separately


369 
as $[p@1,\ldots,p@n]$ and~$p$. The nonterminal symbols are identified with


370 
types~$\tau$, $\tau@1$, \ldots,~$\tau@n$ respectively, and the production's


371 
effect on nonterminals is expressed as the function type


372 
\[ [\tau@1, \ldots, \tau@n]\To \tau. \]


373 
Finally, the template


374 
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n \]


375 
describes the strings of terminals.


376 


377 
A simple type is typically declared for each nonterminal symbol. In


378 
firstorder logic, type~$i$ stands for terms and~$o$ for formulae. Only


379 
the outermost type constructor is taken into account. For example, any


380 
type of the form $\sigma list$ stands for a list; productions may refer


381 
to the symbol {\tt list} and will apply lists of any type.


382 


383 
The symbol associated with a type is called its {\bf root} since it may


384 
serve as the root of a parse tree. Precisely, the root of $(\tau@1, \dots,


385 
\tau@n)ty$ is $ty$, where $\tau@1$, \ldots, $\tau@n$ are types and $ty$ is


386 
a type constructor. Type infixes are a special case of this; in


387 
particular, the root of $\tau@1 \To \tau@2$ is {\tt fun}. Finally, the


388 
root of a type variable is {\tt logic}; general productions might


389 
refer to this nonterminal.


390 


391 
Identifying nonterminals with types allows a constant's type to specify


392 
syntax as well. We can declare the function~$f$ to have type $[\tau@1,


393 
\ldots, \tau@n]\To \tau$ and, through a mixfix annotation, specify the


394 
layout of the function's $n$ arguments. The constant's name, in this


395 
case~$f$, will also serve as the label in the abstract syntax tree. There


396 
are two exceptions to this treatment of constants:


397 
\begin{enumerate}\index{constants!syntactic}


398 
\item A production need not map directly to a logical function. In this


399 
case, you must declare a constant whose purpose is purely syntactic.


400 
By convention such constants begin with the symbol~{\tt\at},


401 
ensuring that they can never be written in formulae.


402 


403 
\item A copy production has no associated constant.\index{productions!copy}


404 
\end{enumerate}


405 
There is something artificial about this representation of productions,


406 
but it is convenient, particularly for simple theory extensions.


407 


408 
\subsection{The general mixfix form}


409 
Here is a detailed account of mixfix declarations. Suppose the following


410 
line occurs within the {\tt consts} section of a {\tt .thy} file:


411 
\begin{center}


412 
{\tt $c$ ::\ "$\sigma$" ("$template$" $ps$ $p$)}


413 
\end{center}


414 
This constant declaration and mixfix annotation is interpreted as follows:


415 
\begin{itemize}\index{productions}


416 
\item The string {\tt $c$} is the name of the constant associated with the


417 
production; unless it is a valid identifier, it must be enclosed in


418 
quotes. If $c$ is empty (given as~{\tt ""}) then this is a copy


419 
production.\index{productions!copy} Otherwise, parsing an instance of the


420 
phrase $template$ generates the \AST{} {\tt ("$c$" $a@1$ $\ldots$


421 
$a@n$)}, where $a@i$ is the \AST{} generated by parsing the $i$th


422 
argument.


423 


424 
\item The constant $c$, if nonempty, is declared to have type $\sigma$.


425 


426 
\item The string $template$ specifies the righthand side of


427 
the production. It has the form


428 
\[ w@0 \;_\; w@1 \;_\; \ldots \;_\; w@n, \]


429 
where each occurrence of {\tt_} denotes an argument position and


430 
the~$w@i$ do not contain~{\tt _}. (If you want a literal~{\tt _} in


431 
the concrete syntax, you must escape it as described below.) The $w@i$


432 
may consist of \rmindex{delimiters}, spaces or


433 
\rmindex{pretty printing} annotations (see below).


434 


435 
\item The type $\sigma$ specifies the production's nonterminal symbols


436 
(or name tokens). If $template$ is of the form above then $\sigma$


437 
must be a function type with at least~$n$ argument positions, say


438 
$\sigma = [\tau@1, \dots, \tau@n] \To \tau$. Nonterminal symbols are


439 
derived from the types $\tau@1$, \ldots,~$\tau@n$, $\tau$ as described


440 
above. Any of these may be function types; the corresponding root is


441 
then \tydx{fun}.


442 


443 
\item The optional list~$ps$ may contain at most $n$ integers, say {\tt


444 
[$p@1$, $\ldots$, $p@m$]}, where $p@i$ is the minimal


445 
priority\indexbold{priorities} required of any phrase that may appear


446 
as the $i$th argument. Missing priorities default to~0.


447 


448 
\item The integer $p$ is the priority of this production. If omitted, it


449 
defaults to the maximal priority.


450 
Priorities range between 0 and \ttindexbold{max_pri} (= 1000).


451 
\end{itemize}


452 
%


453 
The declaration {\tt $c$ ::\ "$\sigma$" ("$template$")} specifies no


454 
priorities. The resulting production puts no priority constraints on any


455 
of its arguments and has maximal priority itself. Omitting priorities in


456 
this manner will introduce syntactic ambiguities unless the production's


457 
righthand side is fully bracketed, as in \verb"if _ then _ else _ fi".


458 


459 
Omitting the mixfix annotation completely, as in {\tt $c$ ::\ "$\sigma$"},


460 
is sensible only if~$c$ is an identifier. Otherwise you will be unable to


461 
write terms involving~$c$.


462 


463 
\begin{warn}


464 
Theories must sometimes declare types for purely syntactic purposes. One


465 
example is \tydx{type}, the builtin type of types. This is a `type of


466 
all types' in the syntactic sense only. Do not declare such types under


467 
{\tt arities} as belonging to class {\tt logic}\index{*logic class}, for


468 
that would allow their use in arbitrary Isabelle


469 
expressions~(\S\ref{logicaltypes}).


470 
\end{warn}


471 


472 
\subsection{Example: arithmetic expressions}


473 
\index{examples!of mixfix declarations}


474 
This theory specification contains a {\tt consts} section with mixfix


475 
declarations encoding the priority grammar from


476 
\S\ref{sec:priority_grammars}:


477 
\begin{ttbox}


478 
EXP = Pure +


479 
types


480 
exp


481 
arities


482 
exp :: logic


483 
consts


484 
"0" :: "exp" ("0" 9)


485 
"+" :: "[exp, exp] => exp" ("_ + _" [0, 1] 0)


486 
"*" :: "[exp, exp] => exp" ("_ * _" [3, 2] 2)


487 
"" :: "exp => exp" (" _" [3] 3)


488 
end


489 
\end{ttbox}


490 
The {\tt arities} declaration causes {\tt exp} to be added as a new root.


491 
If you put this into a file {\tt exp.thy} and load it via {\tt


492 
use_thy "EXP"}, you can run some tests:


493 
\begin{ttbox}


494 
val read_exp = Syntax.test_read (syn_of EXP.thy) "exp";


495 
{\out val it = fn : string > unit}


496 
read_exp "0 * 0 * 0 * 0 + 0 + 0 + 0";


497 
{\out tokens: "0" "*" "0" "*" "0" "*" "0" "+" "0" "+" "0" "+" "0"}


498 
{\out raw: ("+" ("+" ("+" ("*" "0" ("*" "0" ("*" "0" "0"))) "0") "0") "0")}


499 
{\out \vdots}


500 
read_exp "0 +  0 + 0";


501 
{\out tokens: "0" "+" "" "0" "+" "0"}


502 
{\out raw: ("+" ("+" "0" ("" "0")) "0")}


503 
{\out \vdots}


504 
\end{ttbox}


505 
The output of \ttindex{Syntax.test_read} includes the token list ({\tt


506 
tokens}) and the raw \AST{} directly derived from the parse tree,


507 
ignoring parse \AST{} translations. The rest is tracing information


508 
provided by the macro expander (see \S\ref{sec:macros}).


509 


510 
Executing {\tt Syntax.print_gram} reveals the productions derived


511 
from our mixfix declarations (lots of additional information deleted):


512 
\begin{ttbox}


513 
Syntax.print_gram (syn_of EXP.thy);


514 
{\out exp = "0" => "0" (9)}


515 
{\out exp = exp[0] "+" exp[1] => "+" (0)}


516 
{\out exp = exp[3] "*" exp[2] => "*" (2)}


517 
{\out exp = "" exp[3] => "" (3)}


518 
\end{ttbox}


519 


520 


521 
\subsection{The mixfix template}


522 
Let us take a closer look at the string $template$ appearing in mixfix


523 
annotations. This string specifies a list of parsing and printing


524 
directives: delimiters\index{delimiters}, arguments, spaces, blocks of


525 
indentation and line breaks. These are encoded by the following character


526 
sequences:


527 
\index{pretty printing(}


528 
\begin{description}


529 
\item[~$d$~] is a delimiter, namely a nonempty sequence of characters


530 
other than the special characters {\tt _}, {\tt(}, {\tt)} and~{\tt/}.


531 
Even these characters may appear if escaped; this means preceding it with


532 
a~{\tt '} (single quote). Thus you have to write {\tt ''} if you really


533 
want a single quote. Delimiters may never contain spaces.


534 


535 
\item[~{\tt_}~] is an argument position, which stands for a nonterminal symbol


536 
or name token.


537 


538 
\item[~$s$~] is a nonempty sequence of spaces for printing. This and the


539 
following specifications do not affect parsing at all.


540 


541 
\item[~{\tt(}$n$~] opens a pretty printing block. The optional number $n$


542 
specifies how much indentation to add when a line break occurs within the


543 
block. If {\tt(} is not followed by digits, the indentation defaults


544 
to~0.


545 


546 
\item[~{\tt)}~] closes a pretty printing block.


547 


548 
\item[~{\tt//}~] forces a line break.


549 


550 
\item[~{\tt/}$s$~] allows a line break. Here $s$ stands for the string of


551 
spaces (zero or more) right after the {\tt /} character. These spaces


552 
are printed if the break is not taken.


553 
\end{description}


554 
For example, the template {\tt"(_ +/ _)"} specifies an infix operator.


555 
There are two argument positions; the delimiter~{\tt+} is preceded by a


556 
space and followed by a space or line break; the entire phrase is a pretty


557 
printing block. Other examples appear in Fig.\ts\ref{fig:set_trans} below.


558 
Isabelle's pretty printer resembles the one described in


559 
Paulson~\cite{paulson91}.


560 


561 
\index{pretty printing)}


562 


563 


564 
\subsection{Infixes}


565 
\indexbold{infixes}


566 


567 
Infix operators associating to the left or right can be declared


568 
using {\tt infixl} or {\tt infixr}.


569 
Roughly speaking, the form {\tt $c$ ::\ "$\sigma$" (infixl $p$)}


570 
abbreviates the constant declarations


571 
\begin{ttbox}


572 
"op \(c\)" :: "\(\sigma\)" ("op \(c\)")


573 
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p\), \(p+1\)] \(p\))


574 
\end{ttbox}


575 
and {\tt $c$ ::\ "$\sigma$" (infixr $p$)} abbreviates the constant declarations


576 
\begin{ttbox}


577 
"op \(c\)" :: "\(\sigma\)" ("op \(c\)")


578 
"op \(c\)" :: "\(\sigma\)" ("(_ \(c\)/ _)" [\(p+1\), \(p\)] \(p\))


579 
\end{ttbox}


580 
The infix operator is declared as a constant with the prefix {\tt op}.


581 
Thus, prefixing infixes with \sdx{op} makes them behave like ordinary


582 
function symbols, as in \ML. Special characters occurring in~$c$ must be


583 
escaped, as in delimiters, using a single quote.


584 


585 
The expanded forms above would be actually illegal in a {\tt .thy} file


586 
because they declare the constant \hbox{\tt"op \(c\)"} twice.


587 


588 


589 
\subsection{Binders}


590 
\indexbold{binders}


591 
\begingroup


592 
\def\Q{{\cal Q}}


593 
A {\bf binder} is a variablebinding construct such as a quantifier. The


594 
constant declaration


595 
\begin{ttbox}


596 
\(c\) :: "\(\sigma\)" (binder "\(\Q\)" \(p\))


597 
\end{ttbox}


598 
introduces a constant~$c$ of type~$\sigma$, which must have the form


599 
$(\tau@1 \To \tau@2) \To \tau@3$. Its concrete syntax is $\Q~x.P$, where


600 
$x$ is a bound variable of type~$\tau@1$, the body~$P$ has type $\tau@2$


601 
and the whole term has type~$\tau@3$. Special characters in $\Q$ must be


602 
escaped using a single quote.


603 


604 
The declaration is expanded internally to


605 
\begin{ttbox}


606 
\(c\) :: "(\(\tau@1\) => \(\tau@2\)) => \(\tau@3\)"


607 
"\(\Q\)"\hskip3pt :: "[idts, \(\tau@2\)] => \(\tau@3\)" ("(3\(\Q\)_./ _)" \(p\))


608 
\end{ttbox}


609 
Here \ndx{idts} is the nonterminal symbol for a list of identifiers with


610 
optional type constraints (see Fig.\ts\ref{fig:pure_gram}). The


611 
declaration also installs a parse translation\index{translations!parse}


612 
for~$\Q$ and a print translation\index{translations!print} for~$c$ to


613 
translate between the internal and external forms.


614 


615 
A binder of type $(\sigma \To \tau) \To \tau$ can be nested by giving a


616 
list of variables. The external form $\Q~x@1~x@2 \ldots x@n. P$


617 
corresponds to the internal form


618 
\[ c(\lambda x@1. c(\lambda x@2. \ldots c(\lambda x@n. P) \ldots)). \]


619 


620 
\medskip


621 
For example, let us declare the quantifier~$\forall$:\index{quantifiers}


622 
\begin{ttbox}


623 
All :: "('a => o) => o" (binder "ALL " 10)


624 
\end{ttbox}


625 
This lets us write $\forall x.P$ as either {\tt All(\%$x$.$P$)} or {\tt ALL


626 
$x$.$P$}. When printing, Isabelle prefers the latter form, but must fall


627 
back on ${\tt All}(P)$ if $P$ is not an abstraction. Both $P$ and {\tt ALL


628 
$x$.$P$} have type~$o$, the type of formulae, while the bound variable


629 
can be polymorphic.


630 
\endgroup


631 


632 
\index{mixfix declarations)}


633 


634 


635 
\section{Example: some minimal logics} \label{sec:min_logics}


636 
\index{examples!of logic definitions}


637 


638 
This section presents some examples that have a simple syntax. They


639 
demonstrate how to define new objectlogics from scratch.


640 


641 
First we must define how an objectlogic syntax embedded into the


642 
metalogic. Since all theorems must conform to the syntax for~\ndx{prop} (see


643 
Fig.\ts\ref{fig:pure_gram}), that syntax has to be extended with the


644 
objectlevel syntax. Assume that the syntax of your objectlogic defines a


645 
nonterminal symbol~\ndx{o} of formulae. These formulae can now appear in


646 
axioms and theorems wherever \ndx{prop} does if you add the production


647 
\[ prop ~=~ o. \]


648 
This is not a copy production but a coercion from formulae to propositions:


649 
\begin{ttbox}


650 
Base = Pure +


651 
types


652 
o


653 
arities


654 
o :: logic


655 
consts


656 
Trueprop :: "o => prop" ("_" 5)


657 
end


658 
\end{ttbox}


659 
The constant \cdx{Trueprop} (the name is arbitrary) acts as an invisible


660 
coercion function. Assuming this definition resides in a file {\tt base.thy},


661 
you have to load it with the command {\tt use_thy "Base"}.


662 


663 
One of the simplest nontrivial logics is {\bf minimal logic} of


664 
implication. Its definition in Isabelle needs no advanced features but


665 
illustrates the overall mechanism nicely:


666 
\begin{ttbox}


667 
Hilbert = Base +


668 
consts


669 
">" :: "[o, o] => o" (infixr 10)


670 
rules


671 
K "P > Q > P"


672 
S "(P > Q > R) > (P > Q) > P > R"


673 
MP "[ P > Q; P ] ==> Q"


674 
end


675 
\end{ttbox}


676 
After loading this definition from the file {\tt hilbert.thy}, you can


677 
start to prove theorems in the logic:


678 
\begin{ttbox}


679 
goal Hilbert.thy "P > P";


680 
{\out Level 0}


681 
{\out P > P}


682 
{\out 1. P > P}


683 
\ttbreak


684 
by (resolve_tac [Hilbert.MP] 1);


685 
{\out Level 1}


686 
{\out P > P}


687 
{\out 1. ?P > P > P}


688 
{\out 2. ?P}


689 
\ttbreak


690 
by (resolve_tac [Hilbert.MP] 1);


691 
{\out Level 2}


692 
{\out P > P}


693 
{\out 1. ?P1 > ?P > P > P}


694 
{\out 2. ?P1}


695 
{\out 3. ?P}


696 
\ttbreak


697 
by (resolve_tac [Hilbert.S] 1);


698 
{\out Level 3}


699 
{\out P > P}


700 
{\out 1. P > ?Q2 > P}


701 
{\out 2. P > ?Q2}


702 
\ttbreak


703 
by (resolve_tac [Hilbert.K] 1);


704 
{\out Level 4}


705 
{\out P > P}


706 
{\out 1. P > ?Q2}


707 
\ttbreak


708 
by (resolve_tac [Hilbert.K] 1);


709 
{\out Level 5}


710 
{\out P > P}


711 
{\out No subgoals!}


712 
\end{ttbox}


713 
As we can see, this Hilbertstyle formulation of minimal logic is easy to


714 
define but difficult to use. The following natural deduction formulation is


715 
better:


716 
\begin{ttbox}


717 
MinI = Base +


718 
consts


719 
">" :: "[o, o] => o" (infixr 10)


720 
rules


721 
impI "(P ==> Q) ==> P > Q"


722 
impE "[ P > Q; P ] ==> Q"


723 
end


724 
\end{ttbox}


725 
Note, however, that although the two systems are equivalent, this fact


726 
cannot be proved within Isabelle. Axioms {\tt S} and {\tt K} can be


727 
derived in {\tt MinI} (exercise!), but {\tt impI} cannot be derived in {\tt


728 
Hilbert}. The reason is that {\tt impI} is only an {\bf admissible} rule


729 
in {\tt Hilbert}, something that can only be shown by induction over all


730 
possible proofs in {\tt Hilbert}.


731 


732 
We may easily extend minimal logic with falsity:


733 
\begin{ttbox}


734 
MinIF = MinI +


735 
consts


736 
False :: "o"


737 
rules


738 
FalseE "False ==> P"


739 
end


740 
\end{ttbox}


741 
On the other hand, we may wish to introduce conjunction only:


742 
\begin{ttbox}


743 
MinC = Base +


744 
consts


745 
"&" :: "[o, o] => o" (infixr 30)


746 
\ttbreak


747 
rules


748 
conjI "[ P; Q ] ==> P & Q"


749 
conjE1 "P & Q ==> P"


750 
conjE2 "P & Q ==> Q"


751 
end


752 
\end{ttbox}


753 
And if we want to have all three connectives together, we create and load a


754 
theory file consisting of a single line:\footnote{We can combine the


755 
theories without creating a theory file using the ML declaration


756 
\begin{ttbox}


757 
val MinIFC_thy = merge_theories(MinIF,MinC)


758 
\end{ttbox}


759 
\index{*merge_theoriesfnote}}


760 
\begin{ttbox}


761 
MinIFC = MinIF + MinC


762 
\end{ttbox}


763 
Now we can prove mixed theorems like


764 
\begin{ttbox}


765 
goal MinIFC.thy "P & False > Q";


766 
by (resolve_tac [MinI.impI] 1);


767 
by (dresolve_tac [MinC.conjE2] 1);


768 
by (eresolve_tac [MinIF.FalseE] 1);


769 
\end{ttbox}


770 
Try this as an exercise!


771 


772 
