105

1 
%% $Id$


2 
\part{Foundations}

296

3 
The following sections discuss Isabelle's logical foundations in detail:

105

4 
representing logical syntax in the typed $\lambda$calculus; expressing


5 
inference rules in Isabelle's metalogic; combining rules by resolution.

296

6 


7 
If you wish to use Isabelle immediately, please turn to


8 
page~\pageref{chap:getting}. You can always read about foundations later,


9 
either by returning to this point or by looking up particular items in the


10 
index.

105

11 


12 
\begin{figure}


13 
\begin{eqnarray*}


14 
\neg P & \hbox{abbreviates} & P\imp\bot \\


15 
P\bimp Q & \hbox{abbreviates} & (P\imp Q) \conj (Q\imp P)


16 
\end{eqnarray*}


17 
\vskip 4ex


18 


19 
\(\begin{array}{c@{\qquad\qquad}c}


20 
\infer[({\conj}I)]{P\conj Q}{P & Q} &


21 
\infer[({\conj}E1)]{P}{P\conj Q} \qquad


22 
\infer[({\conj}E2)]{Q}{P\conj Q} \\[4ex]


23 


24 
\infer[({\disj}I1)]{P\disj Q}{P} \qquad


25 
\infer[({\disj}I2)]{P\disj Q}{Q} &


26 
\infer[({\disj}E)]{R}{P\disj Q & \infer*{R}{[P]} & \infer*{R}{[Q]}}\\[4ex]


27 


28 
\infer[({\imp}I)]{P\imp Q}{\infer*{Q}{[P]}} &


29 
\infer[({\imp}E)]{Q}{P\imp Q & P} \\[4ex]


30 


31 
&


32 
\infer[({\bot}E)]{P}{\bot}\\[4ex]


33 


34 
\infer[({\forall}I)*]{\forall x.P}{P} &


35 
\infer[({\forall}E)]{P[t/x]}{\forall x.P} \\[3ex]


36 


37 
\infer[({\exists}I)]{\exists x.P}{P[t/x]} &


38 
\infer[({\exists}E)*]{Q}{{\exists x.P} & \infer*{Q}{[P]} } \\[3ex]


39 


40 
{t=t} \,(refl) & \vcenter{\infer[(subst)]{P[u/x]}{t=u & P[t/x]}}


41 
\end{array} \)


42 


43 
\bigskip\bigskip


44 
*{\em Eigenvariable conditions\/}:


45 


46 
$\forall I$: provided $x$ is not free in the assumptions


47 


48 
$\exists E$: provided $x$ is not free in $Q$ or any assumption except $P$


49 
\caption{Intuitionistic firstorder logic} \label{folfig}


50 
\end{figure}


51 

296

52 
\section{Formalizing logical syntax in Isabelle}\label{sec:logicalsyntax}

331

53 
\index{firstorder logic}


54 

105

55 
Figure~\ref{folfig} presents intuitionistic firstorder logic,

296

56 
including equality. Let us see how to formalize

105

57 
this logic in Isabelle, illustrating the main features of Isabelle's


58 
polymorphic metalogic.


59 

331

60 
\index{lambda calc@$\lambda$calculus}

312

61 
Isabelle represents syntax using the simply typed $\lambda$calculus. We


62 
declare a type for each syntactic category of the logic. We declare a


63 
constant for each symbol of the logic, giving each $n$place operation an


64 
$n$argument curried function type. Most importantly,


65 
$\lambda$abstraction represents variable binding in quantifiers.

105

66 

331

67 
\index{types!syntax of}\index{types!function}\index{*fun type}


68 
\index{type constructors}


69 
Isabelle has \MLstyle polymorphic types such as~$(\alpha)list$, where


70 
$list$ is a type constructor and $\alpha$ is a type variable; for example,

105

71 
$(bool)list$ is the type of lists of booleans. Function types have the

312

72 
form $(\sigma,\tau)fun$ or $\sigma\To\tau$, where $\sigma$ and $\tau$ are


73 
types. Curried function types may be abbreviated:

105

74 
\[ \sigma@1\To (\cdots \sigma@n\To \tau\cdots) \quad \hbox{as} \quad

312

75 
[\sigma@1, \ldots, \sigma@n] \To \tau \]

105

76 

312

77 
\index{terms!syntax of}

105

78 
The syntax for terms is summarised below. Note that function application is


79 
written $t(u)$ rather than the usual $t\,u$.


80 
\[

312

81 
\index{lambda abs@$\lambda$abstractions}\index{function applications}

105

82 
\begin{array}{ll}

296

83 
t :: \tau & \hbox{type constraint, on a term or bound variable} \\

105

84 
\lambda x.t & \hbox{abstraction} \\


85 
\lambda x@1\ldots x@n.t


86 
& \hbox{curried abstraction, $\lambda x@1. \ldots \lambda x@n.t$} \\


87 
t(u) & \hbox{application} \\


88 
t (u@1, \ldots, u@n) & \hbox{curried application, $t(u@1)\ldots(u@n)$}


89 
\end{array}


90 
\]


91 


92 

296

93 
\subsection{Simple types and constants}\index{types!simplebold}


94 


95 
The syntactic categories of our logic (Fig.\ts\ref{folfig}) are {\bf


96 
formulae} and {\bf terms}. Formulae denote truth values, so (following


97 
tradition) let us call their type~$o$. To allow~0 and~$Suc(t)$ as terms,


98 
let us declare a type~$nat$ of natural numbers. Later, we shall see


99 
how to admit terms of other types.

105

100 

312

101 
\index{constants}\index{*nat type}\index{*o type}

105

102 
After declaring the types~$o$ and~$nat$, we may declare constants for the


103 
symbols of our logic. Since $\bot$ denotes a truth value (falsity) and 0


104 
denotes a number, we put \begin{eqnarray*}


105 
\bot & :: & o \\


106 
0 & :: & nat.


107 
\end{eqnarray*}


108 
If a symbol requires operands, the corresponding constant must have a


109 
function type. In our logic, the successor function


110 
($Suc$) is from natural numbers to natural numbers, negation ($\neg$) is a


111 
function from truth values to truth values, and the binary connectives are


112 
curried functions taking two truth values as arguments:


113 
\begin{eqnarray*}


114 
Suc & :: & nat\To nat \\


115 
{\neg} & :: & o\To o \\


116 
\conj,\disj,\imp,\bimp & :: & [o,o]\To o


117 
\end{eqnarray*}

296

118 
The binary connectives can be declared as infixes, with appropriate


119 
precedences, so that we write $P\conj Q\disj R$ instead of


120 
$\disj(\conj(P,Q), R)$.

105

121 

331

122 
Section~\ref{sec:definingtheories} below describes the syntax of Isabelle


123 
theory files and illustrates it by extending our logic with mathematical


124 
induction.

105

125 


126 


127 
\subsection{Polymorphic types and constants} \label{polymorphic}


128 
\index{types!polymorphicbold}

296

129 
\index{equality!polymorphic}

312

130 
\index{constants!polymorphic}

296

131 

105

132 
Which type should we assign to the equality symbol? If we tried


133 
$[nat,nat]\To o$, then equality would be restricted to the natural

312

134 
numbers; we should have to declare different equality symbols for each

105

135 
type. Isabelle's type system is polymorphic, so we could declare


136 
\begin{eqnarray*}

296

137 
{=} & :: & [\alpha,\alpha]\To o,

105

138 
\end{eqnarray*}

296

139 
where the type variable~$\alpha$ ranges over all types.

105

140 
But this is also wrong. The declaration is too polymorphic; $\alpha$

296

141 
includes types like~$o$ and $nat\To nat$. Thus, it admits

105

142 
$\bot=\neg(\bot)$ and $Suc=Suc$ as formulae, which is acceptable in


143 
higherorder logic but not in firstorder logic.


144 

296

145 
Isabelle's {\bf type classes}\index{classes} control


146 
polymorphism~\cite{nipkowprehofer}. Each type variable belongs to a


147 
class, which denotes a set of types. Classes are partially ordered by the


148 
subclass relation, which is essentially the subset relation on the sets of


149 
types. They closely resemble the classes of the functional language


150 
Haskell~\cite{haskelltutorial,haskellreport}.

105

151 

312

152 
\index{*logic class}\index{*term class}

105

153 
Isabelle provides the builtin class $logic$, which consists of the logical


154 
types: the ones we want to reason about. Let us declare a class $term$, to


155 
consist of all legal types of terms in our logic. The subclass structure


156 
is now $term\le logic$.


157 

312

158 
\index{*nat type}

296

159 
We put $nat$ in class $term$ by declaring $nat{::}term$. We declare the


160 
equality constant by

105

161 
\begin{eqnarray*}


162 
{=} & :: & [\alpha{::}term,\alpha]\To o


163 
\end{eqnarray*}

296

164 
where $\alpha{::}term$ constrains the type variable~$\alpha$ to class


165 
$term$. Such type variables resemble Standard~\ML's equality type


166 
variables.


167 

312

168 
We give~$o$ and function types the class $logic$ rather than~$term$, since

105

169 
they are not legal types for terms. We may introduce new types of class


170 
$term$  for instance, type $string$ or $real$  at any time. We can

331

171 
even declare type constructors such as~$list$, and state that type

296

172 
$(\tau)list$ belongs to class~$term$ provided $\tau$ does; equality

105

173 
applies to lists of natural numbers but not to lists of formulae. We may


174 
summarize this paragraph by a set of {\bf arity declarations} for type

312

175 
constructors:\index{arities!declaring}


176 
\begin{eqnarray*}\index{*o type}\index{*fun type}


177 
o & :: & logic \\


178 
fun & :: & (logic,logic)logic \\

105

179 
nat, string, real & :: & term \\

312

180 
list & :: & (term)term

105

181 
\end{eqnarray*}

312

182 
(Recall that $fun$ is the type constructor for function types.)

331

183 
In \rmindex{higherorder logic}, equality does apply to truth values and

296

184 
functions; this requires the arity declarations ${o::term}$

312

185 
and ${fun::(term,term)term}$. The class system can also handle

105

186 
overloading.\index{overloadingbold} We could declare $arith$ to be the


187 
subclass of $term$ consisting of the `arithmetic' types, such as~$nat$.


188 
Then we could declare the operators


189 
\begin{eqnarray*}


190 
{+},{},{\times},{/} & :: & [\alpha{::}arith,\alpha]\To \alpha


191 
\end{eqnarray*}


192 
If we declare new types $real$ and $complex$ of class $arith$, then we

331

193 
in effect have three sets of operators:

105

194 
\begin{eqnarray*}


195 
{+},{},{\times},{/} & :: & [nat,nat]\To nat \\


196 
{+},{},{\times},{/} & :: & [real,real]\To real \\


197 
{+},{},{\times},{/} & :: & [complex,complex]\To complex


198 
\end{eqnarray*}


199 
Isabelle will regard these as distinct constants, each of which can be defined

296

200 
separately. We could even introduce the type $(\alpha)vector$ and declare


201 
its arity as $(arith)arith$. Then we could declare the constant

105

202 
\begin{eqnarray*}

296

203 
{+} & :: & [(\alpha)vector,(\alpha)vector]\To (\alpha)vector

105

204 
\end{eqnarray*}

296

205 
and specify it in terms of ${+} :: [\alpha,\alpha]\To \alpha$.

105

206 

296

207 
A type variable may belong to any finite number of classes. Suppose that


208 
we had declared yet another class $ord \le term$, the class of all

105

209 
`ordered' types, and a constant


210 
\begin{eqnarray*}


211 
{\le} & :: & [\alpha{::}ord,\alpha]\To o.


212 
\end{eqnarray*}


213 
In this context the variable $x$ in $x \le (x+x)$ will be assigned type

296

214 
$\alpha{::}\{arith,ord\}$, which means $\alpha$ belongs to both $arith$ and


215 
$ord$. Semantically the set $\{arith,ord\}$ should be understood as the


216 
intersection of the sets of types represented by $arith$ and $ord$. Such


217 
intersections of classes are called \bfindex{sorts}. The empty


218 
intersection of classes, $\{\}$, contains all types and is thus the {\bf


219 
universal sort}.

105

220 

296

221 
Even with overloading, each term has a unique, most general type. For this


222 
to be possible, the class and type declarations must satisfy certain

331

223 
technical constraints; see


224 
\iflabelundefined{sec:refdefiningtheories}%


225 
{Sect.\ Defining Theories in the {\em Reference Manual}}%


226 
{\S\ref{sec:refdefiningtheories}}.

105

227 

296

228 

105

229 
\subsection{Higher types and quantifiers}

312

230 
\index{types!higherbold}\index{quantifiers}

105

231 
Quantifiers are regarded as operations upon functions. Ignoring polymorphism


232 
for the moment, consider the formula $\forall x. P(x)$, where $x$ ranges


233 
over type~$nat$. This is true if $P(x)$ is true for all~$x$. Abstracting


234 
$P(x)$ into a function, this is the same as saying that $\lambda x.P(x)$


235 
returns true for all arguments. Thus, the universal quantifier can be


236 
represented by a constant


237 
\begin{eqnarray*}


238 
\forall & :: & (nat\To o) \To o,


239 
\end{eqnarray*}


240 
which is essentially an infinitary truth table. The representation of $\forall


241 
x. P(x)$ is $\forall(\lambda x. P(x))$.


242 


243 
The existential quantifier is treated


244 
in the same way. Other binding operators are also easily handled; for


245 
instance, the summation operator $\Sigma@{k=i}^j f(k)$ can be represented as


246 
$\Sigma(i,j,\lambda k.f(k))$, where


247 
\begin{eqnarray*}


248 
\Sigma & :: & [nat,nat, nat\To nat] \To nat.


249 
\end{eqnarray*}


250 
Quantifiers may be polymorphic. We may define $\forall$ and~$\exists$ over


251 
all legal types of terms, not just the natural numbers, and


252 
allow summations over all arithmetic types:


253 
\begin{eqnarray*}


254 
\forall,\exists & :: & (\alpha{::}term\To o) \To o \\


255 
\Sigma & :: & [nat,nat, nat\To \alpha{::}arith] \To \alpha


256 
\end{eqnarray*}


257 
Observe that the index variables still have type $nat$, while the values


258 
being summed may belong to any arithmetic type.


259 


260 


261 
\section{Formalizing logical rules in Isabelle}

312

262 
\index{metaimplicationbold}


263 
\index{metaquantifiersbold}


264 
\index{metaequalitybold}

296

265 


266 
Objectlogics are formalized by extending Isabelle's

1878

267 
metalogic~\cite{paulsonfound}, which is intuitionistic higherorder logic.

296

268 
The metalevel connectives are {\bf implication}, the {\bf universal


269 
quantifier}, and {\bf equality}.

105

270 
\begin{itemize}


271 
\item The implication \(\phi\Imp \psi\) means `\(\phi\) implies


272 
\(\psi\)', and expresses logical {\bf entailment}.


273 


274 
\item The quantification \(\Forall x.\phi\) means `\(\phi\) is true for


275 
all $x$', and expresses {\bf generality} in rules and axiom schemes.


276 


277 
\item The equality \(a\equiv b\) means `$a$ equals $b$', for expressing

312

278 
{\bf definitions} (see~\S\ref{definitions}).\index{definitions}


279 
Equalities left over from the unification process, so called {\bf


280 
flexflex constraints},\index{flexflex constraints} are written $a\qeq


281 
b$. The two equality symbols have the same logical meaning.

105

282 


283 
\end{itemize}

312

284 
The syntax of the metalogic is formalized in the same manner


285 
as objectlogics, using the simply typed $\lambda$calculus. Analogous to

105

286 
type~$o$ above, there is a builtin type $prop$ of metalevel truth values.


287 
Metalevel formulae will have this type. Type $prop$ belongs to


288 
class~$logic$; also, $\sigma\To\tau$ belongs to $logic$ provided $\sigma$


289 
and $\tau$ do. Here are the types of the builtin connectives:

312

290 
\begin{eqnarray*}\index{*prop type}\index{*logic class}

105

291 
\Imp & :: & [prop,prop]\To prop \\


292 
\Forall & :: & (\alpha{::}logic\To prop) \To prop \\


293 
{\equiv} & :: & [\alpha{::}\{\},\alpha]\To prop \\

312

294 
\qeq & :: & [\alpha{::}\{\},\alpha]\To prop

105

295 
\end{eqnarray*}

312

296 
The polymorphism in $\Forall$ is restricted to class~$logic$ to exclude


297 
certain types, those used just for parsing. The type variable


298 
$\alpha{::}\{\}$ ranges over the universal sort.

105

299 


300 
In our formalization of firstorder logic, we declared a type~$o$ of


301 
objectlevel truth values, rather than using~$prop$ for this purpose. If


302 
we declared the objectlevel connectives to have types such as


303 
${\neg}::prop\To prop$, then these connectives would be applicable to


304 
metalevel formulae. Keeping $prop$ and $o$ as separate types maintains


305 
the distinction between the metalevel and the objectlevel. To formalize


306 
the inference rules, we shall need to relate the two levels; accordingly,


307 
we declare the constant

312

308 
\index{*Trueprop constant}

105

309 
\begin{eqnarray*}


310 
Trueprop & :: & o\To prop.


311 
\end{eqnarray*}


312 
We may regard $Trueprop$ as a metalevel predicate, reading $Trueprop(P)$ as


313 
`$P$ is true at the objectlevel.' Put another way, $Trueprop$ is a coercion


314 
from $o$ to $prop$.


315 


316 


317 
\subsection{Expressing propositional rules}

312

318 
\index{rules!propositional}

105

319 
We shall illustrate the use of the metalogic by formalizing the rules of

296

320 
Fig.\ts\ref{folfig}. Each objectlevel rule is expressed as a metalevel

105

321 
axiom.


322 


323 
One of the simplest rules is $(\conj E1)$. Making


324 
everything explicit, its formalization in the metalogic is


325 
$$ \Forall P\;Q. Trueprop(P\conj Q) \Imp Trueprop(P). \eqno(\conj E1) $$


326 
This may look formidable, but it has an obvious reading: for all objectlevel


327 
truth values $P$ and~$Q$, if $P\conj Q$ is true then so is~$P$. The


328 
reading is correct because the metalogic has simple models, where


329 
types denote sets and $\Forall$ really means `for all.'


330 

312

331 
\index{*Trueprop constant}

105

332 
Isabelle adopts notational conventions to ease the writing of rules. We may


333 
hide the occurrences of $Trueprop$ by making it an implicit coercion.


334 
Outer universal quantifiers may be dropped. Finally, the nested implication

312

335 
\index{metaimplication}

105

336 
\[ \phi@1\Imp(\cdots \phi@n\Imp\psi\cdots) \]


337 
may be abbreviated as $\List{\phi@1; \ldots; \phi@n} \Imp \psi$, which


338 
formalizes a rule of $n$~premises.


339 


340 
Using these conventions, the conjunction rules become the following axioms.


341 
These fully specify the properties of~$\conj$:


342 
$$ \List{P; Q} \Imp P\conj Q \eqno(\conj I) $$


343 
$$ P\conj Q \Imp P \qquad P\conj Q \Imp Q \eqno(\conj E1,2) $$


344 


345 
\noindent


346 
Next, consider the disjunction rules. The discharge of assumption in


347 
$(\disj E)$ is expressed using $\Imp$:

331

348 
\index{assumptions!discharge of}%

105

349 
$$ P \Imp P\disj Q \qquad Q \Imp P\disj Q \eqno(\disj I1,2) $$


350 
$$ \List{P\disj Q; P\Imp R; Q\Imp R} \Imp R \eqno(\disj E) $$

331

351 
%

312

352 
To understand this treatment of assumptions in natural

105

353 
deduction, look at implication. The rule $({\imp}I)$ is the classic


354 
example of natural deduction: to prove that $P\imp Q$ is true, assume $P$


355 
is true and show that $Q$ must then be true. More concisely, if $P$


356 
implies $Q$ (at the metalevel), then $P\imp Q$ is true (at the


357 
objectlevel). Showing the coercion explicitly, this is formalized as


358 
\[ (Trueprop(P)\Imp Trueprop(Q)) \Imp Trueprop(P\imp Q). \]


359 
The rule $({\imp}E)$ is straightforward; hiding $Trueprop$, the axioms to


360 
specify $\imp$ are


361 
$$ (P \Imp Q) \Imp P\imp Q \eqno({\imp}I) $$


362 
$$ \List{P\imp Q; P} \Imp Q. \eqno({\imp}E) $$


363 


364 
\noindent


365 
Finally, the intuitionistic contradiction rule is formalized as the axiom


366 
$$ \bot \Imp P. \eqno(\bot E) $$


367 


368 
\begin{warn}


369 
Earlier versions of Isabelle, and certain

1878

370 
papers~\cite{paulsonfound,paulson700}, use $\List{P}$ to mean $Trueprop(P)$.

105

371 
\end{warn}


372 


373 
\subsection{Quantifier rules and substitution}

312

374 
\index{quantifiers}\index{rules!quantifier}\index{substitutionbold}


375 
\index{variables!bound}\index{lambda abs@$\lambda$abstractions}


376 
\index{function applications}


377 

105

378 
Isabelle expresses variable binding using $\lambda$abstraction; for instance,


379 
$\forall x.P$ is formalized as $\forall(\lambda x.P)$. Recall that $F(t)$


380 
is Isabelle's syntax for application of the function~$F$ to the argument~$t$;


381 
it is not a metanotation for substitution. On the other hand, a substitution


382 
will take place if $F$ has the form $\lambda x.P$; Isabelle transforms


383 
$(\lambda x.P)(t)$ to~$P[t/x]$ by $\beta$conversion. Thus, we can express


384 
inference rules that involve substitution for bound variables.


385 


386 
\index{parametersbold}\index{eigenvariablessee{parameters}}


387 
A logic may attach provisos to certain of its rules, especially quantifier


388 
rules. We cannot hope to formalize arbitrary provisos. Fortunately, those


389 
typical of quantifier rules always have the same form, namely `$x$ not free in


390 
\ldots {\it (some set of formulae)},' where $x$ is a variable (called a {\bf


391 
parameter} or {\bf eigenvariable}) in some premise. Isabelle treats


392 
provisos using~$\Forall$, its inbuilt notion of `for all'.

312

393 
\index{metaquantifiers}

105

394 


395 
The purpose of the proviso `$x$ not free in \ldots' is


396 
to ensure that the premise may not make assumptions about the value of~$x$,


397 
and therefore holds for all~$x$. We formalize $(\forall I)$ by


398 
\[ \left(\Forall x. Trueprop(P(x))\right) \Imp Trueprop(\forall x.P(x)). \]


399 
This means, `if $P(x)$ is true for all~$x$, then $\forall x.P(x)$ is true.'


400 
The $\forall E$ rule exploits $\beta$conversion. Hiding $Trueprop$, the


401 
$\forall$ axioms are


402 
$$ \left(\Forall x. P(x)\right) \Imp \forall x.P(x) \eqno(\forall I) $$

296

403 
$$ (\forall x.P(x)) \Imp P(t). \eqno(\forall E)$$

105

404 


405 
\noindent


406 
We have defined the objectlevel universal quantifier~($\forall$)


407 
using~$\Forall$. But we do not require metalevel counterparts of all the


408 
connectives of the objectlogic! Consider the existential quantifier:


409 
$$ P(t) \Imp \exists x.P(x) \eqno(\exists I)$$


410 
$$ \List{\exists x.P(x);\; \Forall x. P(x)\Imp Q} \Imp Q \eqno(\exists E) $$


411 
Let us verify $(\exists E)$ semantically. Suppose that the premises

312

412 
hold; since $\exists x.P(x)$ is true, we may choose an~$a$ such that $P(a)$ is

105

413 
true. Instantiating $\Forall x. P(x)\Imp Q$ with $a$ yields $P(a)\Imp Q$, and


414 
we obtain the desired conclusion, $Q$.


415 


416 
The treatment of substitution deserves mention. The rule


417 
\[ \infer{P[u/t]}{t=u & P} \]


418 
would be hard to formalize in Isabelle. It calls for replacing~$t$ by $u$


419 
throughout~$P$, which cannot be expressed using $\beta$conversion. Our

312

420 
rule~$(subst)$ uses~$P$ as a template for substitution, inferring $P[u/x]$


421 
from~$P[t/x]$. When we formalize this as an axiom, the template becomes a


422 
function variable:

105

423 
$$ \List{t=u; P(t)} \Imp P(u). \eqno(subst)$$


424 


425 


426 
\subsection{Signatures and theories}

312

427 
\index{signaturesbold}


428 

105

429 
A {\bf signature} contains the information necessary for type checking,

312

430 
parsing and pretty printing a term. It specifies classes and their

331

431 
relationships, types and their arities, constants and their types, etc. It


432 
also contains syntax rules, specified using mixfix declarations.

105

433 


434 
Two signatures can be merged provided their specifications are compatible 


435 
they must not, for example, assign different types to the same constant.


436 
Under similar conditions, a signature can be extended. Signatures are


437 
managed internally by Isabelle; users seldom encounter them.


438 

312

439 
\index{theoriesbold}

105

440 
A {\bf theory} consists of a signature plus a collection of axioms. The


441 
{\bf pure} theory contains only the metalogic. Theories can be combined


442 
provided their signatures are compatible. A theory definition extends an


443 
existing theory with further signature specifications  classes, types,


444 
constants and mixfix declarations  plus a list of axioms, expressed as


445 
strings to be parsed. A theory can formalize a small piece of mathematics,


446 
such as lists and their operations, or an entire logic. A mathematical


447 
development typically involves many theories in a hierarchy. For example,


448 
the pure theory could be extended to form a theory for

296

449 
Fig.\ts\ref{folfig}; this could be extended in two separate ways to form a

105

450 
theory for natural numbers and a theory for lists; the union of these two


451 
could be extended into a theory defining the length of a list:

296

452 
\begin{tt}


453 
\[

105

454 
\begin{array}{c@{}c@{}c@{}c@{}c}


455 
{} & {} & \hbox{Length} & {} & {} \\


456 
{} & {} & \uparrow & {} & {} \\


457 
{} & {} &\hbox{Nat}+\hbox{List}& {} & {} \\


458 
{} & \nearrow & {} & \nwarrow & {} \\


459 
\hbox{Nat} & {} & {} & {} & \hbox{List} \\


460 
{} & \nwarrow & {} & \nearrow & {} \\


461 
{} & {} &\hbox{FOL} & {} & {} \\


462 
{} & {} & \uparrow & {} & {} \\


463 
{} & {} &\hbox{Pure}& {} & {}


464 
\end{array}


465 
\]

331

466 
\end{tt}%

105

467 
Each Isabelle proof typically works within a single theory, which is


468 
associated with the proof state. However, many different theories may


469 
coexist at the same time, and you may work in each of these during a single


470 
session.


471 

331

472 
\begin{warn}\index{constants!clashes with variables}%

296

473 
Confusing problems arise if you work in the wrong theory. Each theory


474 
defines its own syntax. An identifier may be regarded in one theory as a


475 
constant and in another as a variable.


476 
\end{warn}

105

477 


478 
\section{Proof construction in Isabelle}

296

479 
I have elsewhere described the metalogic and demonstrated it by

1878

480 
formalizing firstorder logic~\cite{paulsonfound}. There is a onetoone

296

481 
correspondence between metalevel proofs and objectlevel proofs. To each


482 
use of a metalevel axiom, such as $(\forall I)$, there is a use of the


483 
corresponding objectlevel rule. Objectlevel assumptions and parameters


484 
have metalevel counterparts. The metalevel formalization is {\bf


485 
faithful}, admitting no incorrect objectlevel inferences, and {\bf


486 
adequate}, admitting all correct objectlevel inferences. These


487 
properties must be demonstrated separately for each objectlogic.

105

488 


489 
The metalogic is defined by a collection of inference rules, including

331

490 
equational rules for the $\lambda$calculus and logical rules. The rules

105

491 
for~$\Imp$ and~$\Forall$ resemble those for~$\imp$ and~$\forall$ in

296

492 
Fig.\ts\ref{folfig}. Proofs performed using the primitive metarules

105

493 
would be lengthy; Isabelle proofs normally use certain derived rules.


494 
{\bf Resolution}, in particular, is convenient for backward proof.


495 


496 
Unification is central to theorem proving. It supports quantifier


497 
reasoning by allowing certain `unknown' terms to be instantiated later,


498 
possibly in stages. When proving that the time required to sort $n$


499 
integers is proportional to~$n^2$, we need not state the constant of


500 
proportionality; when proving that a hardware adder will deliver the sum of


501 
its inputs, we need not state how many clock ticks will be required. Such


502 
quantities often emerge from the proof.


503 

312

504 
Isabelle provides {\bf schematic variables}, or {\bf


505 
unknowns},\index{unknowns} for unification. Logically, unknowns are free


506 
variables. But while ordinary variables remain fixed, unification may


507 
instantiate unknowns. Unknowns are written with a ?\ prefix and are


508 
frequently subscripted: $\Var{a}$, $\Var{a@1}$, $\Var{a@2}$, \ldots,


509 
$\Var{P}$, $\Var{P@1}$, \ldots.

105

510 


511 
Recall that an inference rule of the form


512 
\[ \infer{\phi}{\phi@1 & \ldots & \phi@n} \]


513 
is formalized in Isabelle's metalogic as the axiom

312

514 
$\List{\phi@1; \ldots; \phi@n} \Imp \phi$.\index{resolution}

296

515 
Such axioms resemble Prolog's Horn clauses, and can be combined by

105

516 
resolution  Isabelle's principal proof method. Resolution yields both


517 
forward and backward proof. Backward proof works by unifying a goal with


518 
the conclusion of a rule, whose premises become new subgoals. Forward proof


519 
works by unifying theorems with the premises of a rule, deriving a new theorem.


520 

312

521 
Isabelle formulae require an extended notion of resolution.

296

522 
They differ from Horn clauses in two major respects:

105

523 
\begin{itemize}


524 
\item They are written in the typed $\lambda$calculus, and therefore must be


525 
resolved using higherorder unification.


526 

296

527 
\item The constituents of a clause need not be atomic formulae. Any


528 
formula of the form $Trueprop(\cdots)$ is atomic, but axioms such as


529 
${\imp}I$ and $\forall I$ contain nonatomic formulae.

105

530 
\end{itemize}

296

531 
Isabelle has little in common with classical resolution theorem provers

105

532 
such as Otter~\cite{wosbledsoe}. At the metalevel, Isabelle proves


533 
theorems in their positive form, not by refutation. However, an


534 
objectlogic that includes a contradiction rule may employ a refutation


535 
proof procedure.


536 


537 


538 
\subsection{Higherorder unification}


539 
\index{unification!higherorderbold}


540 
Unification is equation solving. The solution of $f(\Var{x},c) \qeq


541 
f(d,\Var{y})$ is $\Var{x}\equiv d$ and $\Var{y}\equiv c$. {\bf


542 
Higherorder unification} is equation solving for typed $\lambda$terms.


543 
To handle $\beta$conversion, it must reduce $(\lambda x.t)u$ to $t[u/x]$.


544 
That is easy  in the typed $\lambda$calculus, all reduction sequences


545 
terminate at a normal form. But it must guess the unknown


546 
function~$\Var{f}$ in order to solve the equation


547 
\begin{equation} \label{houeqn}


548 
\Var{f}(t) \qeq g(u@1,\ldots,u@k).


549 
\end{equation}


550 
Huet's~\cite{huet75} search procedure solves equations by imitation and

312

551 
projection. {\bf Imitation} makes~$\Var{f}$ apply the leading symbol (if a


552 
constant) of the righthand side. To solve equation~(\ref{houeqn}), it


553 
guesses

105

554 
\[ \Var{f} \equiv \lambda x. g(\Var{h@1}(x),\ldots,\Var{h@k}(x)), \]


555 
where $\Var{h@1}$, \ldots, $\Var{h@k}$ are new unknowns. Assuming there are no


556 
other occurrences of~$\Var{f}$, equation~(\ref{houeqn}) simplifies to the


557 
set of equations


558 
\[ \Var{h@1}(t)\qeq u@1 \quad\ldots\quad \Var{h@k}(t)\qeq u@k. \]


559 
If the procedure solves these equations, instantiating $\Var{h@1}$, \ldots,


560 
$\Var{h@k}$, then it yields an instantiation for~$\Var{f}$.


561 


562 
{\bf Projection} makes $\Var{f}$ apply one of its arguments. To solve


563 
equation~(\ref{houeqn}), if $t$ expects~$m$ arguments and delivers a


564 
result of suitable type, it guesses


565 
\[ \Var{f} \equiv \lambda x. x(\Var{h@1}(x),\ldots,\Var{h@m}(x)), \]


566 
where $\Var{h@1}$, \ldots, $\Var{h@m}$ are new unknowns. Assuming there are no


567 
other occurrences of~$\Var{f}$, equation~(\ref{houeqn}) simplifies to the


568 
equation


569 
\[ t(\Var{h@1}(t),\ldots,\Var{h@m}(t)) \qeq g(u@1,\ldots,u@k). $$


570 

331

571 
\begin{warn}\index{unification!incompleteness of}%

105

572 
Huet's unification procedure is complete. Isabelle's polymorphic version,


573 
which solves for type unknowns as well as for term unknowns, is incomplete.


574 
The problem is that projection requires type information. In


575 
equation~(\ref{houeqn}), if the type of~$t$ is unknown, then projections


576 
are possible for all~$m\geq0$, and the types of the $\Var{h@i}$ will be


577 
similarly unconstrained. Therefore, Isabelle never attempts such


578 
projections, and may fail to find unifiers where a type unknown turns out


579 
to be a function type.


580 
\end{warn}


581 

312

582 
\index{unknowns!functionbold}

105

583 
Given $\Var{f}(t@1,\ldots,t@n)\qeq u$, Huet's procedure could make up to


584 
$n+1$ guesses. The search tree and set of unifiers may be infinite. But


585 
higherorder unification can work effectively, provided you are careful


586 
with {\bf function unknowns}:


587 
\begin{itemize}


588 
\item Equations with no function unknowns are solved using firstorder


589 
unification, extended to treat bound variables. For example, $\lambda x.x


590 
\qeq \lambda x.\Var{y}$ has no solution because $\Var{y}\equiv x$ would


591 
capture the free variable~$x$.


592 


593 
\item An occurrence of the term $\Var{f}(x,y,z)$, where the arguments are


594 
distinct bound variables, causes no difficulties. Its projections can only


595 
match the corresponding variables.


596 


597 
\item Even an equation such as $\Var{f}(a)\qeq a+a$ is all right. It has


598 
four solutions, but Isabelle evaluates them lazily, trying projection before


599 
imitation. The first solution is usually the one desired:


600 
\[ \Var{f}\equiv \lambda x. x+x \quad


601 
\Var{f}\equiv \lambda x. a+x \quad


602 
\Var{f}\equiv \lambda x. x+a \quad


603 
\Var{f}\equiv \lambda x. a+a \]


604 
\item Equations such as $\Var{f}(\Var{x},\Var{y})\qeq t$ and


605 
$\Var{f}(\Var{g}(x))\qeq t$ admit vast numbers of unifiers, and must be


606 
avoided.


607 
\end{itemize}


608 
In problematic cases, you may have to instantiate some unknowns before


609 
invoking unification.


610 


611 


612 
\subsection{Joining rules by resolution} \label{joining}


613 
\index{resolutionbold}


614 
Let $\List{\psi@1; \ldots; \psi@m} \Imp \psi$ and $\List{\phi@1; \ldots;


615 
\phi@n} \Imp \phi$ be two Isabelle theorems, representing objectlevel rules.


616 
Choosing some~$i$ from~1 to~$n$, suppose that $\psi$ and $\phi@i$ have a


617 
higherorder unifier. Writing $Xs$ for the application of substitution~$s$ to


618 
expression~$X$, this means there is some~$s$ such that $\psi s\equiv \phi@i s$.


619 
By resolution, we may conclude


620 
\[ (\List{\phi@1; \ldots; \phi@{i1}; \psi@1; \ldots; \psi@m;


621 
\phi@{i+1}; \ldots; \phi@n} \Imp \phi)s.


622 
\]


623 
The substitution~$s$ may instantiate unknowns in both rules. In short,


624 
resolution is the following rule:


625 
\[ \infer[(\psi s\equiv \phi@i s)]


626 
{(\List{\phi@1; \ldots; \phi@{i1}; \psi@1; \ldots; \psi@m;


627 
\phi@{i+1}; \ldots; \phi@n} \Imp \phi)s}


628 
{\List{\psi@1; \ldots; \psi@m} \Imp \psi & &


629 
\List{\phi@1; \ldots; \phi@n} \Imp \phi}


630 
\]


631 
It operates at the metalevel, on Isabelle theorems, and is justified by


632 
the properties of $\Imp$ and~$\Forall$. It takes the number~$i$ (for


633 
$1\leq i\leq n$) as a parameter and may yield infinitely many conclusions,


634 
one for each unifier of $\psi$ with $\phi@i$. Isabelle returns these


635 
conclusions as a sequence (lazy list).


636 


637 
Resolution expects the rules to have no outer quantifiers~($\Forall$). It


638 
may rename or instantiate any schematic variables, but leaves free


639 
variables unchanged. When constructing a theory, Isabelle puts the rules


640 
into a standard form containing no free variables; for instance, $({\imp}E)$


641 
becomes


642 
\[ \List{\Var{P}\imp \Var{Q}; \Var{P}} \Imp \Var{Q}.


643 
\]


644 
When resolving two rules, the unknowns in the first rule are renamed, by


645 
subscripting, to make them distinct from the unknowns in the second rule. To

331

646 
resolve $({\imp}E)$ with itself, the first copy of the rule becomes

105

647 
\[ \List{\Var{P@1}\imp \Var{Q@1}; \Var{P@1}} \Imp \Var{Q@1}. \]


648 
Resolving this with $({\imp}E)$ in the first premise, unifying $\Var{Q@1}$ with


649 
$\Var{P}\imp \Var{Q}$, is the metalevel inference


650 
\[ \infer{\List{\Var{P@1}\imp (\Var{P}\imp \Var{Q}); \Var{P@1}; \Var{P}}


651 
\Imp\Var{Q}.}


652 
{\List{\Var{P@1}\imp \Var{Q@1}; \Var{P@1}} \Imp \Var{Q@1} & &


653 
\List{\Var{P}\imp \Var{Q}; \Var{P}} \Imp \Var{Q}}


654 
\]


655 
Renaming the unknowns in the resolvent, we have derived the

331

656 
objectlevel rule\index{rules!derived}

105

657 
\[ \infer{Q.}{R\imp (P\imp Q) & R & P} \]


658 
Joining rules in this fashion is a simple way of proving theorems. The


659 
derived rules are conservative extensions of the objectlogic, and may permit


660 
simpler proofs. Let us consider another example. Suppose we have the axiom


661 
$$ \forall x\,y. Suc(x)=Suc(y)\imp x=y. \eqno (inject) $$


662 


663 
\noindent


664 
The standard form of $(\forall E)$ is


665 
$\forall x.\Var{P}(x) \Imp \Var{P}(\Var{t})$.


666 
Resolving $(inject)$ with $(\forall E)$ replaces $\Var{P}$ by


667 
$\lambda x. \forall y. Suc(x)=Suc(y)\imp x=y$ and leaves $\Var{t}$


668 
unchanged, yielding


669 
\[ \forall y. Suc(\Var{t})=Suc(y)\imp \Var{t}=y. \]


670 
Resolving this with $(\forall E)$ puts a subscript on~$\Var{t}$


671 
and yields


672 
\[ Suc(\Var{t@1})=Suc(\Var{t})\imp \Var{t@1}=\Var{t}. \]


673 
Resolving this with $({\imp}E)$ increases the subscripts and yields


674 
\[ Suc(\Var{t@2})=Suc(\Var{t@1})\Imp \Var{t@2}=\Var{t@1}.


675 
\]


676 
We have derived the rule


677 
\[ \infer{m=n,}{Suc(m)=Suc(n)} \]


678 
which goes directly from $Suc(m)=Suc(n)$ to $m=n$. It is handy for simplifying


679 
an equation like $Suc(Suc(Suc(m)))=Suc(Suc(Suc(0)))$.


680 


681 

296

682 
\section{Lifting a rule into a context}

105

683 
The rules $({\imp}I)$ and $(\forall I)$ may seem unsuitable for


684 
resolution. They have nonatomic premises, namely $P\Imp Q$ and $\Forall


685 
x.P(x)$, while the conclusions of all the rules are atomic (they have the form


686 
$Trueprop(\cdots)$). Isabelle gets round the problem through a metainference


687 
called \bfindex{lifting}. Let us consider how to construct proofs such as


688 
\[ \infer[({\imp}I)]{P\imp(Q\imp R)}


689 
{\infer[({\imp}I)]{Q\imp R}

296

690 
{\infer*{R}{[P,Q]}}}

105

691 
\qquad


692 
\infer[(\forall I)]{\forall x\,y.P(x,y)}


693 
{\infer[(\forall I)]{\forall y.P(x,y)}{P(x,y)}}


694 
\]


695 

296

696 
\subsection{Lifting over assumptions}

312

697 
\index{assumptions!lifting over}

105

698 
Lifting over $\theta\Imp{}$ is the following metainference rule:


699 
\[ \infer{\List{\theta\Imp\phi@1; \ldots; \theta\Imp\phi@n} \Imp


700 
(\theta \Imp \phi)}


701 
{\List{\phi@1; \ldots; \phi@n} \Imp \phi} \]


702 
This is clearly sound: if $\List{\phi@1; \ldots; \phi@n} \Imp \phi$ is true and

296

703 
$\theta\Imp\phi@1$, \ldots, $\theta\Imp\phi@n$ and $\theta$ are all true then

105

704 
$\phi$ must be true. Iterated lifting over a series of metaformulae


705 
$\theta@k$, \ldots, $\theta@1$ yields an objectrule whose conclusion is


706 
$\List{\theta@1; \ldots; \theta@k} \Imp \phi$. Typically the $\theta@i$ are


707 
the assumptions in a natural deduction proof; lifting copies them into a rule's


708 
premises and conclusion.


709 


710 
When resolving two rules, Isabelle lifts the first one if necessary. The


711 
standard form of $({\imp}I)$ is


712 
\[ (\Var{P} \Imp \Var{Q}) \Imp \Var{P}\imp \Var{Q}. \]


713 
To resolve this rule with itself, Isabelle modifies one copy as follows: it


714 
renames the unknowns to $\Var{P@1}$ and $\Var{Q@1}$, then lifts the rule over

296

715 
$\Var{P}\Imp{}$ to obtain


716 
\[ (\Var{P}\Imp (\Var{P@1} \Imp \Var{Q@1})) \Imp (\Var{P} \Imp


717 
(\Var{P@1}\imp \Var{Q@1})). \]


718 
Using the $\List{\cdots}$ abbreviation, this can be written as

105

719 
\[ \List{\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}; \Var{P}}


720 
\Imp \Var{P@1}\imp \Var{Q@1}. \]


721 
Unifying $\Var{P}\Imp \Var{P@1}\imp\Var{Q@1}$ with $\Var{P} \Imp


722 
\Var{Q}$ instantiates $\Var{Q}$ to ${\Var{P@1}\imp\Var{Q@1}}$.


723 
Resolution yields


724 
\[ (\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}) \Imp


725 
\Var{P}\imp(\Var{P@1}\imp\Var{Q@1}). \]


726 
This represents the derived rule


727 
\[ \infer{P\imp(Q\imp R).}{\infer*{R}{[P,Q]}} \]


728 

296

729 
\subsection{Lifting over parameters}

312

730 
\index{parameters!lifting over}

105

731 
An analogous form of lifting handles premises of the form $\Forall x\ldots\,$.


732 
Here, lifting prefixes an objectrule's premises and conclusion with $\Forall


733 
x$. At the same time, lifting introduces a dependence upon~$x$. It replaces


734 
each unknown $\Var{a}$ in the rule by $\Var{a'}(x)$, where $\Var{a'}$ is a new


735 
unknown (by subscripting) of suitable type  necessarily a function type. In


736 
short, lifting is the metainference


737 
\[ \infer{\List{\Forall x.\phi@1^x; \ldots; \Forall x.\phi@n^x}


738 
\Imp \Forall x.\phi^x,}


739 
{\List{\phi@1; \ldots; \phi@n} \Imp \phi} \]

296

740 
%


741 
where $\phi^x$ stands for the result of lifting unknowns over~$x$ in


742 
$\phi$. It is not hard to verify that this metainference is sound. If


743 
$\phi\Imp\psi$ then $\phi^x\Imp\psi^x$ for all~$x$; so if $\phi^x$ is true


744 
for all~$x$ then so is $\psi^x$. Thus, from $\phi\Imp\psi$ we conclude


745 
$(\Forall x.\phi^x) \Imp (\Forall x.\psi^x)$.


746 

105

747 
For example, $(\disj I)$ might be lifted to


748 
\[ (\Forall x.\Var{P@1}(x)) \Imp (\Forall x. \Var{P@1}(x)\disj \Var{Q@1}(x))\]


749 
and $(\forall I)$ to


750 
\[ (\Forall x\,y.\Var{P@1}(x,y)) \Imp (\Forall x. \forall y.\Var{P@1}(x,y)). \]


751 
Isabelle has renamed a bound variable in $(\forall I)$ from $x$ to~$y$,


752 
avoiding a clash. Resolving the above with $(\forall I)$ is the metainference


753 
\[ \infer{\Forall x\,y.\Var{P@1}(x,y)) \Imp \forall x\,y.\Var{P@1}(x,y)) }


754 
{(\Forall x\,y.\Var{P@1}(x,y)) \Imp


755 
(\Forall x. \forall y.\Var{P@1}(x,y)) &


756 
(\Forall x.\Var{P}(x)) \Imp (\forall x.\Var{P}(x))} \]


757 
Here, $\Var{P}$ is replaced by $\lambda x.\forall y.\Var{P@1}(x,y)$; the


758 
resolvent expresses the derived rule


759 
\[ \vcenter{ \infer{\forall x\,y.Q(x,y)}{Q(x,y)} }


760 
\quad\hbox{provided $x$, $y$ not free in the assumptions}


761 
\]

1878

762 
I discuss lifting and parameters at length elsewhere~\cite{paulsonfound}.

296

763 
Miller goes into even greater detail~\cite{millermixed}.

105

764 


765 


766 
\section{Backward proof by resolution}

312

767 
\index{resolution!in backward proof}

296

768 

105

769 
Resolution is convenient for deriving simple rules and for reasoning


770 
forward from facts. It can also support backward proof, where we start


771 
with a goal and refine it to progressively simpler subgoals until all have

296

772 
been solved. {\sc lcf} and its descendants {\sc hol} and Nuprl provide

105

773 
tactics and tacticals, which constitute a highlevel language for

296

774 
expressing proof searches. {\bf Tactics} refine subgoals while {\bf


775 
tacticals} combine tactics.

105

776 

312

777 
\index{LCF system}

105

778 
Isabelle's tactics and tacticals work differently from {\sc lcf}'s. An

296

779 
Isabelle rule is bidirectional: there is no distinction between

105

780 
inputs and outputs. {\sc lcf} has a separate tactic for each rule;


781 
Isabelle performs refinement by any rule in a uniform fashion, using


782 
resolution.


783 


784 
Isabelle works with metalevel theorems of the form


785 
\( \List{\phi@1; \ldots; \phi@n} \Imp \phi \).


786 
We have viewed this as the {\bf rule} with premises


787 
$\phi@1$,~\ldots,~$\phi@n$ and conclusion~$\phi$. It can also be viewed as

312

788 
the {\bf proof state}\index{proof state}


789 
with subgoals $\phi@1$,~\ldots,~$\phi@n$ and main

105

790 
goal~$\phi$.


791 


792 
To prove the formula~$\phi$, take $\phi\Imp \phi$ as the initial proof


793 
state. This assertion is, trivially, a theorem. At a later stage in the


794 
backward proof, a typical proof state is $\List{\phi@1; \ldots; \phi@n}

296

795 
\Imp \phi$. This proof state is a theorem, ensuring that the subgoals


796 
$\phi@1$,~\ldots,~$\phi@n$ imply~$\phi$. If $n=0$ then we have

105

797 
proved~$\phi$ outright. If $\phi$ contains unknowns, they may become


798 
instantiated during the proof; a proof state may be $\List{\phi@1; \ldots;


799 
\phi@n} \Imp \phi'$, where $\phi'$ is an instance of~$\phi$.


800 


801 
\subsection{Refinement by resolution}


802 
To refine subgoal~$i$ of a proof state by a rule, perform the following


803 
resolution:


804 
\[ \infer{\hbox{new proof state}}{\hbox{rule} & & \hbox{proof state}} \]

331

805 
Suppose the rule is $\List{\psi'@1; \ldots; \psi'@m} \Imp \psi'$ after


806 
lifting over subgoal~$i$'s assumptions and parameters. If the proof state


807 
is $\List{\phi@1; \ldots; \phi@n} \Imp \phi$, then the new proof state is


808 
(for~$1\leq i\leq n$)

105

809 
\[ (\List{\phi@1; \ldots; \phi@{i1}; \psi'@1;


810 
\ldots; \psi'@m; \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s. \]


811 
Substitution~$s$ unifies $\psi'$ with~$\phi@i$. In the proof state,


812 
subgoal~$i$ is replaced by $m$ new subgoals, the rule's instantiated premises.


813 
If some of the rule's unknowns are left uninstantiated, they become new


814 
unknowns in the proof state. Refinement by~$(\exists I)$, namely


815 
\[ \Var{P}(\Var{t}) \Imp \exists x. \Var{P}(x), \]


816 
inserts a new unknown derived from~$\Var{t}$ by subscripting and lifting.


817 
We do not have to specify an `existential witness' when


818 
applying~$(\exists I)$. Further resolutions may instantiate unknowns in


819 
the proof state.


820 


821 
\subsection{Proof by assumption}

312

822 
\index{assumptions!use of}

105

823 
In the course of a natural deduction proof, parameters $x@1$, \ldots,~$x@l$ and


824 
assumptions $\theta@1$, \ldots, $\theta@k$ accumulate, forming a context for


825 
each subgoal. Repeated lifting steps can lift a rule into any context. To


826 
aid readability, Isabelle puts contexts into a normal form, gathering the


827 
parameters at the front:


828 
\begin{equation} \label{contexteqn}


829 
\Forall x@1 \ldots x@l. \List{\theta@1; \ldots; \theta@k}\Imp\theta.


830 
\end{equation}


831 
Under the usual reading of the connectives, this expresses that $\theta$


832 
follows from $\theta@1$,~\ldots~$\theta@k$ for arbitrary


833 
$x@1$,~\ldots,~$x@l$. It is trivially true if $\theta$ equals any of


834 
$\theta@1$,~\ldots~$\theta@k$, or is unifiable with any of them. This


835 
models proof by assumption in natural deduction.


836 


837 
Isabelle automates the metainference for proof by assumption. Its arguments


838 
are the metatheorem $\List{\phi@1; \ldots; \phi@n} \Imp \phi$, and some~$i$


839 
from~1 to~$n$, where $\phi@i$ has the form~(\ref{contexteqn}). Its results


840 
are metatheorems of the form


841 
\[ (\List{\phi@1; \ldots; \phi@{i1}; \phi@{i+1}; \phi@n} \Imp \phi)s \]


842 
for each $s$ and~$j$ such that $s$ unifies $\lambda x@1 \ldots x@l. \theta@j$


843 
with $\lambda x@1 \ldots x@l. \theta$. Isabelle supplies the parameters


844 
$x@1$,~\ldots,~$x@l$ to higherorder unification as bound variables, which


845 
regards them as unique constants with a limited scope  this enforces

1878

846 
parameter provisos~\cite{paulsonfound}.

105

847 

296

848 
The premise represents a proof state with~$n$ subgoals, of which the~$i$th


849 
is to be solved by assumption. Isabelle searches the subgoal's context for


850 
an assumption~$\theta@j$ that can solve it. For each unifier, the


851 
metainference returns an instantiated proof state from which the $i$th


852 
subgoal has been removed. Isabelle searches for a unifying assumption; for


853 
readability and robustness, proofs do not refer to assumptions by number.

105

854 

296

855 
Consider the proof state


856 
\[ (\List{P(a); P(b)} \Imp P(\Var{x})) \Imp Q(\Var{x}). \]

105

857 
Proof by assumption (with $i=1$, the only possibility) yields two results:


858 
\begin{itemize}


859 
\item $Q(a)$, instantiating $\Var{x}\equiv a$


860 
\item $Q(b)$, instantiating $\Var{x}\equiv b$


861 
\end{itemize}


862 
Here, proof by assumption affects the main goal. It could also affect

296

863 
other subgoals; if we also had the subgoal ${\List{P(b); P(c)} \Imp


864 
P(\Var{x})}$, then $\Var{x}\equiv a$ would transform it to ${\List{P(b);


865 
P(c)} \Imp P(a)}$, which might be unprovable.


866 

105

867 


868 
\subsection{A propositional proof} \label{propproof}


869 
\index{examples!propositional}


870 
Our first example avoids quantifiers. Given the main goal $P\disj P\imp


871 
P$, Isabelle creates the initial state

296

872 
\[ (P\disj P\imp P)\Imp (P\disj P\imp P). \]


873 
%

105

874 
Bear in mind that every proof state we derive will be a metatheorem,

296

875 
expressing that the subgoals imply the main goal. Our aim is to reach the


876 
state $P\disj P\imp P$; this metatheorem is the desired result.


877 


878 
The first step is to refine subgoal~1 by (${\imp}I)$, creating a new state


879 
where $P\disj P$ is an assumption:

105

880 
\[ (P\disj P\Imp P)\Imp (P\disj P\imp P) \]


881 
The next step is $(\disj E)$, which replaces subgoal~1 by three new subgoals.


882 
Because of lifting, each subgoal contains a copy of the context  the


883 
assumption $P\disj P$. (In fact, this assumption is now redundant; we shall


884 
shortly see how to get rid of it!) The new proof state is the following


885 
metatheorem, laid out for clarity:


886 
\[ \begin{array}{l@{}l@{\qquad\qquad}l}


887 
\lbrakk\;& P\disj P\Imp \Var{P@1}\disj\Var{Q@1}; & \hbox{(subgoal 1)} \\


888 
& \List{P\disj P; \Var{P@1}} \Imp P; & \hbox{(subgoal 2)} \\


889 
& \List{P\disj P; \Var{Q@1}} \Imp P & \hbox{(subgoal 3)} \\


890 
\rbrakk\;& \Imp (P\disj P\imp P) & \hbox{(main goal)}


891 
\end{array}


892 
\]


893 
Notice the unknowns in the proof state. Because we have applied $(\disj E)$,


894 
we must prove some disjunction, $\Var{P@1}\disj\Var{Q@1}$. Of course,


895 
subgoal~1 is provable by assumption. This instantiates both $\Var{P@1}$ and


896 
$\Var{Q@1}$ to~$P$ throughout the proof state:


897 
\[ \begin{array}{l@{}l@{\qquad\qquad}l}


898 
\lbrakk\;& \List{P\disj P; P} \Imp P; & \hbox{(subgoal 1)} \\


899 
& \List{P\disj P; P} \Imp P & \hbox{(subgoal 2)} \\


900 
\rbrakk\;& \Imp (P\disj P\imp P) & \hbox{(main goal)}


901 
\end{array} \]


902 
Both of the remaining subgoals can be proved by assumption. After two such

296

903 
steps, the proof state is $P\disj P\imp P$.


904 

105

905 


906 
\subsection{A quantifier proof}


907 
\index{examples!with quantifiers}


908 
To illustrate quantifiers and $\Forall$lifting, let us prove


909 
$(\exists x.P(f(x)))\imp(\exists x.P(x))$. The initial proof


910 
state is the trivial metatheorem


911 
\[ (\exists x.P(f(x)))\imp(\exists x.P(x)) \Imp


912 
(\exists x.P(f(x)))\imp(\exists x.P(x)). \]


913 
As above, the first step is refinement by (${\imp}I)$:


914 
\[ (\exists x.P(f(x))\Imp \exists x.P(x)) \Imp


915 
(\exists x.P(f(x)))\imp(\exists x.P(x))


916 
\]


917 
The next step is $(\exists E)$, which replaces subgoal~1 by two new subgoals.


918 
Both have the assumption $\exists x.P(f(x))$. The new proof


919 
state is the metatheorem


920 
\[ \begin{array}{l@{}l@{\qquad\qquad}l}


921 
\lbrakk\;& \exists x.P(f(x)) \Imp \exists x.\Var{P@1}(x); & \hbox{(subgoal 1)} \\


922 
& \Forall x.\List{\exists x.P(f(x)); \Var{P@1}(x)} \Imp


923 
\exists x.P(x) & \hbox{(subgoal 2)} \\


924 
\rbrakk\;& \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) & \hbox{(main goal)}


925 
\end{array}


926 
\]


927 
The unknown $\Var{P@1}$ appears in both subgoals. Because we have applied


928 
$(\exists E)$, we must prove $\exists x.\Var{P@1}(x)$, where $\Var{P@1}(x)$ may


929 
become any formula possibly containing~$x$. Proving subgoal~1 by assumption


930 
instantiates $\Var{P@1}$ to~$\lambda x.P(f(x))$:


931 
\[ \left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp


932 
\exists x.P(x)\right)


933 
\Imp (\exists x.P(f(x)))\imp(\exists x.P(x))


934 
\]


935 
The next step is refinement by $(\exists I)$. The rule is lifted into the

296

936 
context of the parameter~$x$ and the assumption $P(f(x))$. This copies


937 
the context to the subgoal and allows the existential witness to

105

938 
depend upon~$x$:


939 
\[ \left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp


940 
P(\Var{x@2}(x))\right)


941 
\Imp (\exists x.P(f(x)))\imp(\exists x.P(x))


942 
\]


943 
The existential witness, $\Var{x@2}(x)$, consists of an unknown


944 
applied to a parameter. Proof by assumption unifies $\lambda x.P(f(x))$


945 
with $\lambda x.P(\Var{x@2}(x))$, instantiating $\Var{x@2}$ to $f$. The final


946 
proof state contains no subgoals: $(\exists x.P(f(x)))\imp(\exists x.P(x))$.


947 


948 


949 
\subsection{Tactics and tacticals}


950 
\index{tacticsbold}\index{tacticalsbold}


951 
{\bf Tactics} perform backward proof. Isabelle tactics differ from those


952 
of {\sc lcf}, {\sc hol} and Nuprl by operating on entire proof states,


953 
rather than on individual subgoals. An Isabelle tactic is a function that


954 
takes a proof state and returns a sequence (lazy list) of possible

296

955 
successor states. Lazy lists are coded in ML as functions, a standard

105

956 
technique~\cite{paulson91}. Isabelle represents proof states by theorems.


957 


958 
Basic tactics execute the metarules described above, operating on a


959 
given subgoal. The {\bf resolution tactics} take a list of rules and


960 
return next states for each combination of rule and unifier. The {\bf


961 
assumption tactic} examines the subgoal's assumptions and returns next


962 
states for each combination of assumption and unifier. Lazy lists are


963 
essential because higherorder resolution may return infinitely many


964 
unifiers. If there are no matching rules or assumptions then no next


965 
states are generated; a tactic application that returns an empty list is


966 
said to {\bf fail}.


967 


968 
Sequences realize their full potential with {\bf tacticals}  operators


969 
for combining tactics. Depthfirst search, breadthfirst search and


970 
bestfirst search (where a heuristic function selects the best state to


971 
explore) return their outcomes as a sequence. Isabelle provides such


972 
procedures in the form of tacticals. Simpler procedures can be expressed

312

973 
directly using the basic tacticals {\tt THEN}, {\tt ORELSE} and {\tt REPEAT}:


974 
\begin{ttdescription}


975 
\item[$tac1$ THEN $tac2$] is a tactic for sequential composition. Applied

105

976 
to a proof state, it returns all states reachable in two steps by applying


977 
$tac1$ followed by~$tac2$.


978 

312

979 
\item[$tac1$ ORELSE $tac2$] is a choice tactic. Applied to a state, it

105

980 
tries~$tac1$ and returns the result if nonempty; otherwise, it uses~$tac2$.


981 

312

982 
\item[REPEAT $tac$] is a repetition tactic. Applied to a state, it

331

983 
returns all states reachable by applying~$tac$ as long as possible  until


984 
it would fail.

312

985 
\end{ttdescription}

105

986 
For instance, this tactic repeatedly applies $tac1$ and~$tac2$, giving


987 
$tac1$ priority:

312

988 
\begin{center} \tt


989 
REPEAT($tac1$ ORELSE $tac2$)


990 
\end{center}

105

991 


992 


993 
\section{Variations on resolution}


994 
In principle, resolution and proof by assumption suffice to prove all


995 
theorems. However, specialized forms of resolution are helpful for working


996 
with elimination rules. Elimresolution applies an elimination rule to an


997 
assumption; destructresolution is similar, but applies a rule in a forward


998 
style.


999 


1000 
The last part of the section shows how the techniques for proving theorems


1001 
can also serve to derive rules.


1002 


1003 
\subsection{Elimresolution}

312

1004 
\index{elimresolutionbold}\index{assumptions!deleting}


1005 

105

1006 
Consider proving the theorem $((R\disj R)\disj R)\disj R\imp R$. By

331

1007 
$({\imp}I)$, we prove~$R$ from the assumption $((R\disj R)\disj R)\disj R$.

105

1008 
Applying $(\disj E)$ to this assumption yields two subgoals, one that


1009 
assumes~$R$ (and is therefore trivial) and one that assumes $(R\disj


1010 
R)\disj R$. This subgoal admits another application of $(\disj E)$. Since


1011 
natural deduction never discards assumptions, we eventually generate a


1012 
subgoal containing much that is redundant:


1013 
\[ \List{((R\disj R)\disj R)\disj R; (R\disj R)\disj R; R\disj R; R} \Imp R. \]


1014 
In general, using $(\disj E)$ on the assumption $P\disj Q$ creates two new


1015 
subgoals with the additional assumption $P$ or~$Q$. In these subgoals,

331

1016 
$P\disj Q$ is redundant. Other elimination rules behave


1017 
similarly. In firstorder logic, only universally quantified

105

1018 
assumptions are sometimes needed more than once  say, to prove


1019 
$P(f(f(a)))$ from the assumptions $\forall x.P(x)\imp P(f(x))$ and~$P(a)$.


1020 


1021 
Many logics can be formulated as sequent calculi that delete redundant


1022 
assumptions after use. The rule $(\disj E)$ might become


1023 
\[ \infer[\disj\hbox{left}]


1024 
{\Gamma,P\disj Q,\Delta \turn \Theta}


1025 
{\Gamma,P,\Delta \turn \Theta && \Gamma,Q,\Delta \turn \Theta} \]


1026 
In backward proof, a goal containing $P\disj Q$ on the left of the~$\turn$


1027 
(that is, as an assumption) splits into two subgoals, replacing $P\disj Q$


1028 
by $P$ or~$Q$. But the sequent calculus, with its explicit handling of


1029 
assumptions, can be tiresome to use.


1030 


1031 
Elimresolution is Isabelle's way of getting sequent calculus behaviour


1032 
from natural deduction rules. It lets an elimination rule consume an

296

1033 
assumption. Elimresolution combines two metatheorems:


1034 
\begin{itemize}


1035 
\item a rule $\List{\psi@1; \ldots; \psi@m} \Imp \psi$


1036 
\item a proof state $\List{\phi@1; \ldots; \phi@n} \Imp \phi$


1037 
\end{itemize}


1038 
The rule must have at least one premise, thus $m>0$. Write the rule's


1039 
lifted form as $\List{\psi'@1; \ldots; \psi'@m} \Imp \psi'$. Suppose we


1040 
wish to change subgoal number~$i$.


1041 


1042 
Ordinary resolution would attempt to reduce~$\phi@i$,


1043 
replacing subgoal~$i$ by $m$ new ones. Elimresolution tries


1044 
simultaneously to reduce~$\phi@i$ and to solve~$\psi'@1$ by assumption; it

105

1045 
returns a sequence of next states. Each of these replaces subgoal~$i$ by


1046 
instances of $\psi'@2$, \ldots, $\psi'@m$ from which the selected


1047 
assumption has been deleted. Suppose $\phi@i$ has the parameter~$x$ and


1048 
assumptions $\theta@1$,~\ldots,~$\theta@k$. Then $\psi'@1$, the rule's first


1049 
premise after lifting, will be


1050 
\( \Forall x. \List{\theta@1; \ldots; \theta@k}\Imp \psi^{x}@1 \).

296

1051 
Elimresolution tries to unify $\psi'\qeq\phi@i$ and


1052 
$\lambda x. \theta@j \qeq \lambda x. \psi^{x}@1$ simultaneously, for


1053 
$j=1$,~\ldots,~$k$.

105

1054 


1055 
Let us redo the example from~\S\ref{propproof}. The elimination rule


1056 
is~$(\disj E)$,


1057 
\[ \List{\Var{P}\disj \Var{Q};\; \Var{P}\Imp \Var{R};\; \Var{Q}\Imp \Var{R}}


1058 
\Imp \Var{R} \]


1059 
and the proof state is $(P\disj P\Imp P)\Imp (P\disj P\imp P)$. The

331

1060 
lifted rule is

105

1061 
\[ \begin{array}{l@{}l}


1062 
\lbrakk\;& P\disj P \Imp \Var{P@1}\disj\Var{Q@1}; \\


1063 
& \List{P\disj P ;\; \Var{P@1}} \Imp \Var{R@1}; \\


1064 
& \List{P\disj P ;\; \Var{Q@1}} \Imp \Var{R@1} \\

1865

1065 
\rbrakk\;& \Imp (P\disj P \Imp \Var{R@1})

105

1066 
\end{array}


1067 
\]

331

1068 
Unification takes the simultaneous equations

105

1069 
$P\disj P \qeq \Var{P@1}\disj\Var{Q@1}$ and $\Var{R@1} \qeq P$, yielding


1070 
$\Var{P@1}\equiv\Var{Q@1}\equiv\Var{R@1} \equiv P$. The new proof state

331

1071 
is simply

105

1072 
\[ \List{P \Imp P;\; P \Imp P} \Imp (P\disj P\imp P).


1073 
\]


1074 
Elimresolution's simultaneous unification gives better control


1075 
than ordinary resolution. Recall the substitution rule:


1076 
$$ \List{\Var{t}=\Var{u}; \Var{P}(\Var{t})} \Imp \Var{P}(\Var{u})


1077 
\eqno(subst) $$


1078 
Unsuitable for ordinary resolution because $\Var{P}(\Var{u})$ admits many


1079 
unifiers, $(subst)$ works well with elimresolution. It deletes some


1080 
assumption of the form $x=y$ and replaces every~$y$ by~$x$ in the subgoal


1081 
formula. The simultaneous unification instantiates $\Var{u}$ to~$y$; if


1082 
$y$ is not an unknown, then $\Var{P}(y)$ can easily be unified with another


1083 
formula.


1084 


1085 
In logical parlance, the premise containing the connective to be eliminated


1086 
is called the \bfindex{major premise}. Elimresolution expects the major


1087 
premise to come first. The order of the premises is significant in


1088 
Isabelle.


1089 


1090 
\subsection{Destruction rules} \label{destruct}

312

1091 
\index{rules!destruction}\index{rules!elimination}


1092 
\index{forward proof}


1093 

296

1094 
Looking back to Fig.\ts\ref{folfig}, notice that there are two kinds of

105

1095 
elimination rule. The rules $({\conj}E1)$, $({\conj}E2)$, $({\imp}E)$ and


1096 
$({\forall}E)$ extract the conclusion from the major premise. In Isabelle

312

1097 
parlance, such rules are called {\bf destruction rules}; they are readable

105

1098 
and easy to use in forward proof. The rules $({\disj}E)$, $({\bot}E)$ and


1099 
$({\exists}E)$ work by discharging assumptions; they support backward proof


1100 
in a style reminiscent of the sequent calculus.


1101 


1102 
The latter style is the most general form of elimination rule. In natural


1103 
deduction, there is no way to recast $({\disj}E)$, $({\bot}E)$ or


1104 
$({\exists}E)$ as destruction rules. But we can write general elimination


1105 
rules for $\conj$, $\imp$ and~$\forall$:


1106 
\[


1107 
\infer{R}{P\conj Q & \infer*{R}{[P,Q]}} \qquad


1108 
\infer{R}{P\imp Q & P & \infer*{R}{[Q]}} \qquad


1109 
\infer{Q}{\forall x.P & \infer*{Q}{[P[t/x]]}}


1110 
\]


1111 
Because they are concise, destruction rules are simpler to derive than the


1112 
corresponding elimination rules. To facilitate their use in backward


1113 
proof, Isabelle provides a means of transforming a destruction rule such as


1114 
\[ \infer[\quad\hbox{to the elimination rule}\quad]{Q}{P@1 & \ldots & P@m}


1115 
\infer{R.}{P@1 & \ldots & P@m & \infer*{R}{[Q]}}


1116 
\]

331

1117 
{\bf Destructresolution}\index{destructresolution} combines this


1118 
transformation with elimresolution. It applies a destruction rule to some


1119 
assumption of a subgoal. Given the rule above, it replaces the


1120 
assumption~$P@1$ by~$Q$, with new subgoals of showing instances of $P@2$,


1121 
\ldots,~$P@m$. Destructresolution works forward from a subgoal's


1122 
assumptions. Ordinary resolution performs forward reasoning from theorems,


1123 
as illustrated in~\S\ref{joining}.

105

1124 


1125 


1126 
\subsection{Deriving rules by resolution} \label{deriving}

312

1127 
\index{rules!derivedbold}\index{metaassumptions!syntax of}

105

1128 
The metalogic, itself a form of the predicate calculus, is defined by a


1129 
system of natural deduction rules. Each theorem may depend upon


1130 
metaassumptions. The theorem that~$\phi$ follows from the assumptions


1131 
$\phi@1$, \ldots, $\phi@n$ is written


1132 
\[ \phi \quad [\phi@1,\ldots,\phi@n]. \]


1133 
A more conventional notation might be $\phi@1,\ldots,\phi@n \turn \phi$,


1134 
but Isabelle's notation is more readable with large formulae.


1135 


1136 
Metalevel natural deduction provides a convenient mechanism for deriving


1137 
new objectlevel rules. To derive the rule


1138 
\[ \infer{\phi,}{\theta@1 & \ldots & \theta@k} \]


1139 
assume the premises $\theta@1$,~\ldots,~$\theta@k$ at the


1140 
metalevel. Then prove $\phi$, possibly using these assumptions.


1141 
Starting with a proof state $\phi\Imp\phi$, assumptions may accumulate,


1142 
reaching a final state such as


1143 
\[ \phi \quad [\theta@1,\ldots,\theta@k]. \]


1144 
The metarule for $\Imp$ introduction discharges an assumption.


1145 
Discharging them in the order $\theta@k,\ldots,\theta@1$ yields the


1146 
metatheorem $\List{\theta@1; \ldots; \theta@k} \Imp \phi$, with no


1147 
assumptions. This represents the desired rule.


1148 
Let us derive the general $\conj$ elimination rule:


1149 
$$ \infer{R}{P\conj Q & \infer*{R}{[P,Q]}} \eqno(\conj E)$$


1150 
We assume $P\conj Q$ and $\List{P;Q}\Imp R$, and commence backward proof in


1151 
the state $R\Imp R$. Resolving this with the second assumption yields the


1152 
state


1153 
\[ \phantom{\List{P\conj Q;\; P\conj Q}}


1154 
\llap{$\List{P;Q}$}\Imp R \quad [\,\List{P;Q}\Imp R\,]. \]

331

1155 
Resolving subgoals~1 and~2 with~$({\conj}E1)$ and~$({\conj}E2)$,

105

1156 
respectively, yields the state

331

1157 
\[ \List{P\conj \Var{Q@1};\; \Var{P@2}\conj Q}\Imp R


1158 
\quad [\,\List{P;Q}\Imp R\,].


1159 
\]


1160 
The unknowns $\Var{Q@1}$ and~$\Var{P@2}$ arise from unconstrained


1161 
subformulae in the premises of~$({\conj}E1)$ and~$({\conj}E2)$. Resolving


1162 
both subgoals with the assumption $P\conj Q$ instantiates the unknowns to yield

105

1163 
\[ R \quad [\, \List{P;Q}\Imp R, P\conj Q \,]. \]


1164 
The proof may use the metaassumptions in any order, and as often as


1165 
necessary; when finished, we discharge them in the correct order to


1166 
obtain the desired form:


1167 
\[ \List{P\conj Q;\; \List{P;Q}\Imp R} \Imp R \]


1168 
We have derived the rule using free variables, which prevents their


1169 
premature instantiation during the proof; we may now replace them by


1170 
schematic variables.


1171 


1172 
\begin{warn}

331

1173 
Schematic variables are not allowed in metaassumptions, for a variety of


1174 
reasons. Metaassumptions remain fixed throughout a proof.

105

1175 
\end{warn}


1176 
