diff -r f51d4a302962 -r 5386df44a037 src/Doc/Intro/document/foundations.tex --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/Doc/Intro/document/foundations.tex Tue Aug 28 18:57:32 2012 +0200 @@ -0,0 +1,1178 @@ +\part{Foundations} +The following sections discuss Isabelle's logical foundations in detail: +representing logical syntax in the typed $\lambda$-calculus; expressing +inference rules in Isabelle's meta-logic; combining rules by resolution. + +If you wish to use Isabelle immediately, please turn to +page~\pageref{chap:getting}. You can always read about foundations later, +either by returning to this point or by looking up particular items in the +index. + +\begin{figure} +\begin{eqnarray*} + \neg P & \hbox{abbreviates} & P\imp\bot \\ + P\bimp Q & \hbox{abbreviates} & (P\imp Q) \conj (Q\imp P) +\end{eqnarray*} +\vskip 4ex + +\(\begin{array}{c@{\qquad\qquad}c} + \infer[({\conj}I)]{P\conj Q}{P & Q} & + \infer[({\conj}E1)]{P}{P\conj Q} \qquad + \infer[({\conj}E2)]{Q}{P\conj Q} \\[4ex] + + \infer[({\disj}I1)]{P\disj Q}{P} \qquad + \infer[({\disj}I2)]{P\disj Q}{Q} & + \infer[({\disj}E)]{R}{P\disj Q & \infer*{R}{[P]} & \infer*{R}{[Q]}}\\[4ex] + + \infer[({\imp}I)]{P\imp Q}{\infer*{Q}{[P]}} & + \infer[({\imp}E)]{Q}{P\imp Q & P} \\[4ex] + + & + \infer[({\bot}E)]{P}{\bot}\\[4ex] + + \infer[({\forall}I)*]{\forall x.P}{P} & + \infer[({\forall}E)]{P[t/x]}{\forall x.P} \\[3ex] + + \infer[({\exists}I)]{\exists x.P}{P[t/x]} & + \infer[({\exists}E)*]{Q}{{\exists x.P} & \infer*{Q}{[P]} } \\[3ex] + + {t=t} \,(refl) & \vcenter{\infer[(subst)]{P[u/x]}{t=u & P[t/x]}} +\end{array} \) + +\bigskip\bigskip +*{\em Eigenvariable conditions\/}: + +$\forall I$: provided $x$ is not free in the assumptions + +$\exists E$: provided $x$ is not free in $Q$ or any assumption except $P$ +\caption{Intuitionistic first-order logic} \label{fol-fig} +\end{figure} + +\section{Formalizing logical syntax in Isabelle}\label{sec:logical-syntax} +\index{first-order logic} + +Figure~\ref{fol-fig} presents intuitionistic first-order logic, +including equality. Let us see how to formalize +this logic in Isabelle, illustrating the main features of Isabelle's +polymorphic meta-logic. + +\index{lambda calc@$\lambda$-calculus} +Isabelle represents syntax using the simply typed $\lambda$-calculus. We +declare a type for each syntactic category of the logic. We declare a +constant for each symbol of the logic, giving each $n$-place operation an +$n$-argument curried function type. Most importantly, +$\lambda$-abstraction represents variable binding in quantifiers. + +\index{types!syntax of}\index{types!function}\index{*fun type} +\index{type constructors} +Isabelle has \ML-style polymorphic types such as~$(\alpha)list$, where +$list$ is a type constructor and $\alpha$ is a type variable; for example, +$(bool)list$ is the type of lists of booleans. Function types have the +form $(\sigma,\tau)fun$ or $\sigma\To\tau$, where $\sigma$ and $\tau$ are +types. Curried function types may be abbreviated: +\[ \sigma@1\To (\cdots \sigma@n\To \tau\cdots) \quad \hbox{as} \quad +[\sigma@1, \ldots, \sigma@n] \To \tau \] + +\index{terms!syntax of} The syntax for terms is summarised below. +Note that there are two versions of function application syntax +available in Isabelle: either $t\,u$, which is the usual form for +higher-order languages, or $t(u)$, trying to look more like +first-order. The latter syntax is used throughout the manual. 
+\[ +\index{lambda abs@$\lambda$-abstractions}\index{function applications} +\begin{array}{ll} + t :: \tau & \hbox{type constraint, on a term or bound variable} \\ + \lambda x.t & \hbox{abstraction} \\ + \lambda x@1\ldots x@n.t + & \hbox{curried abstraction, $\lambda x@1. \ldots \lambda x@n.t$} \\ + t(u) & \hbox{application} \\ + t (u@1, \ldots, u@n) & \hbox{curried application, $t(u@1)\ldots(u@n)$} +\end{array} +\] + + +\subsection{Simple types and constants}\index{types!simple|bold} + +The syntactic categories of our logic (Fig.\ts\ref{fol-fig}) are {\bf + formulae} and {\bf terms}. Formulae denote truth values, so (following +tradition) let us call their type~$o$. To allow~0 and~$Suc(t)$ as terms, +let us declare a type~$nat$ of natural numbers. Later, we shall see +how to admit terms of other types. + +\index{constants}\index{*nat type}\index{*o type} +After declaring the types~$o$ and~$nat$, we may declare constants for the +symbols of our logic. Since $\bot$ denotes a truth value (falsity) and 0 +denotes a number, we put \begin{eqnarray*} + \bot & :: & o \\ + 0 & :: & nat. +\end{eqnarray*} +If a symbol requires operands, the corresponding constant must have a +function type. In our logic, the successor function +($Suc$) is from natural numbers to natural numbers, negation ($\neg$) is a +function from truth values to truth values, and the binary connectives are +curried functions taking two truth values as arguments: +\begin{eqnarray*} + Suc & :: & nat\To nat \\ + {\neg} & :: & o\To o \\ + \conj,\disj,\imp,\bimp & :: & [o,o]\To o +\end{eqnarray*} +The binary connectives can be declared as infixes, with appropriate +precedences, so that we write $P\conj Q\disj R$ instead of +$\disj(\conj(P,Q), R)$. + +Section~\ref{sec:defining-theories} below describes the syntax of Isabelle +theory files and illustrates it by extending our logic with mathematical +induction. + + +\subsection{Polymorphic types and constants} \label{polymorphic} +\index{types!polymorphic|bold} +\index{equality!polymorphic} +\index{constants!polymorphic} + +Which type should we assign to the equality symbol? If we tried +$[nat,nat]\To o$, then equality would be restricted to the natural +numbers; we should have to declare different equality symbols for each +type. Isabelle's type system is polymorphic, so we could declare +\begin{eqnarray*} + {=} & :: & [\alpha,\alpha]\To o, +\end{eqnarray*} +where the type variable~$\alpha$ ranges over all types. +But this is also wrong. The declaration is too polymorphic; $\alpha$ +includes types like~$o$ and $nat\To nat$. Thus, it admits +$\bot=\neg(\bot)$ and $Suc=Suc$ as formulae, which is acceptable in +higher-order logic but not in first-order logic. + +Isabelle's {\bf type classes}\index{classes} control +polymorphism~\cite{nipkow-prehofer}. Each type variable belongs to a +class, which denotes a set of types. Classes are partially ordered by the +subclass relation, which is essentially the subset relation on the sets of +types. They closely resemble the classes of the functional language +Haskell~\cite{haskell-tutorial,haskell-report}. + +\index{*logic class}\index{*term class} +Isabelle provides the built-in class $logic$, which consists of the logical +types: the ones we want to reason about. Let us declare a class $term$, to +consist of all legal types of terms in our logic. The subclass structure +is now $term\le logic$. + +\index{*nat type} +We put $nat$ in class $term$ by declaring $nat{::}term$. 
We declare the +equality constant by +\begin{eqnarray*} + {=} & :: & [\alpha{::}term,\alpha]\To o +\end{eqnarray*} +where $\alpha{::}term$ constrains the type variable~$\alpha$ to class +$term$. Such type variables resemble Standard~\ML's equality type +variables. + +We give~$o$ and function types the class $logic$ rather than~$term$, since +they are not legal types for terms. We may introduce new types of class +$term$ --- for instance, type $string$ or $real$ --- at any time. We can +even declare type constructors such as~$list$, and state that type +$(\tau)list$ belongs to class~$term$ provided $\tau$ does; equality +applies to lists of natural numbers but not to lists of formulae. We may +summarize this paragraph by a set of {\bf arity declarations} for type +constructors:\index{arities!declaring} +\begin{eqnarray*}\index{*o type}\index{*fun type} + o & :: & logic \\ + fun & :: & (logic,logic)logic \\ + nat, string, real & :: & term \\ + list & :: & (term)term +\end{eqnarray*} +(Recall that $fun$ is the type constructor for function types.) +In \rmindex{higher-order logic}, equality does apply to truth values and +functions; this requires the arity declarations ${o::term}$ +and ${fun::(term,term)term}$. The class system can also handle +overloading.\index{overloading|bold} We could declare $arith$ to be the +subclass of $term$ consisting of the `arithmetic' types, such as~$nat$. +Then we could declare the operators +\begin{eqnarray*} + {+},{-},{\times},{/} & :: & [\alpha{::}arith,\alpha]\To \alpha +\end{eqnarray*} +If we declare new types $real$ and $complex$ of class $arith$, then we +in effect have three sets of operators: +\begin{eqnarray*} + {+},{-},{\times},{/} & :: & [nat,nat]\To nat \\ + {+},{-},{\times},{/} & :: & [real,real]\To real \\ + {+},{-},{\times},{/} & :: & [complex,complex]\To complex +\end{eqnarray*} +Isabelle will regard these as distinct constants, each of which can be defined +separately. We could even introduce the type $(\alpha)vector$ and declare +its arity as $(arith)arith$. Then we could declare the constant +\begin{eqnarray*} + {+} & :: & [(\alpha)vector,(\alpha)vector]\To (\alpha)vector +\end{eqnarray*} +and specify it in terms of ${+} :: [\alpha,\alpha]\To \alpha$. + +A type variable may belong to any finite number of classes. Suppose that +we had declared yet another class $ord \le term$, the class of all +`ordered' types, and a constant +\begin{eqnarray*} + {\le} & :: & [\alpha{::}ord,\alpha]\To o. +\end{eqnarray*} +In this context the variable $x$ in $x \le (x+x)$ will be assigned type +$\alpha{::}\{arith,ord\}$, which means $\alpha$ belongs to both $arith$ and +$ord$. Semantically the set $\{arith,ord\}$ should be understood as the +intersection of the sets of types represented by $arith$ and $ord$. Such +intersections of classes are called \bfindex{sorts}. The empty +intersection of classes, $\{\}$, contains all types and is thus the {\bf + universal sort}. + +Even with overloading, each term has a unique, most general type. For this +to be possible, the class and type declarations must satisfy certain +technical constraints; see +\iflabelundefined{sec:ref-defining-theories}% + {Sect.\ Defining Theories in the {\em Reference Manual}}% + {\S\ref{sec:ref-defining-theories}}. + + +\subsection{Higher types and quantifiers} +\index{types!higher|bold}\index{quantifiers} +Quantifiers are regarded as operations upon functions. Ignoring polymorphism +for the moment, consider the formula $\forall x. P(x)$, where $x$ ranges +over type~$nat$. 
This is true if $P(x)$ is true for all~$x$. Abstracting
+$P(x)$ into a function, this is the same as saying that $\lambda x.P(x)$
+returns true for all arguments. Thus, the universal quantifier can be
+represented by a constant
+\begin{eqnarray*}
+  \forall & :: & (nat\To o) \To o,
+\end{eqnarray*}
+which is essentially an infinitary truth table. The representation of $\forall
+x. P(x)$ is $\forall(\lambda x. P(x))$.
+
+The existential quantifier is treated
+in the same way. Other binding operators are also easily handled; for
+instance, the summation operator $\Sigma@{k=i}^j f(k)$ can be represented as
+$\Sigma(i,j,\lambda k.f(k))$, where
+\begin{eqnarray*}
+  \Sigma & :: & [nat,nat, nat\To nat] \To nat.
+\end{eqnarray*}
+Quantifiers may be polymorphic. We may define $\forall$ and~$\exists$ over
+all legal types of terms, not just the natural numbers, and
+allow summations over all arithmetic types:
+\begin{eqnarray*}
+  \forall,\exists & :: & (\alpha{::}term\To o) \To o \\
+  \Sigma & :: & [nat,nat, nat\To \alpha{::}arith] \To \alpha
+\end{eqnarray*}
+Observe that the index variables still have type $nat$, while the values
+being summed may belong to any arithmetic type.
+
+
+\section{Formalizing logical rules in Isabelle}
+\index{meta-implication|bold}
+\index{meta-quantifiers|bold}
+\index{meta-equality|bold}
+
+Object-logics are formalized by extending Isabelle's
+meta-logic~\cite{paulson-found}, which is intuitionistic higher-order logic.
+The meta-level connectives are {\bf implication}, the {\bf universal
+  quantifier}, and {\bf equality}.
+\begin{itemize}
+  \item The implication \(\phi\Imp \psi\) means `\(\phi\) implies
+\(\psi\)', and expresses logical {\bf entailment}.
+
+  \item The quantification \(\Forall x.\phi\) means `\(\phi\) is true for
+all $x$', and expresses {\bf generality} in rules and axiom schemes.
+
+\item The equality \(a\equiv b\) means `$a$ equals $b$', for expressing
+  {\bf definitions} (see~\S\ref{definitions}).\index{definitions}
+  Equalities left over from the unification process, so-called {\bf
+  flex-flex constraints},\index{flex-flex constraints} are written $a\qeq
+  b$. The two equality symbols have the same logical meaning.
+
+\end{itemize}
+The syntax of the meta-logic is formalized in the same manner
+as object-logics, using the simply typed $\lambda$-calculus. Analogous to
+type~$o$ above, there is a built-in type $prop$ of meta-level truth values.
+Meta-level formulae will have this type. Type $prop$ belongs to
+class~$logic$; also, $\sigma\To\tau$ belongs to $logic$ provided $\sigma$
+and $\tau$ do. Here are the types of the built-in connectives:
+\begin{eqnarray*}\index{*prop type}\index{*logic class}
+  \Imp     & :: & [prop,prop]\To prop \\
+  \Forall  & :: & (\alpha{::}logic\To prop) \To prop \\
+  {\equiv} & :: & [\alpha{::}\{\},\alpha]\To prop \\
+  \qeq     & :: & [\alpha{::}\{\},\alpha]\To prop
+\end{eqnarray*}
+The polymorphism in $\Forall$ is restricted to class~$logic$ to exclude
+certain types, those used just for parsing. The type variable
+$\alpha{::}\{\}$ ranges over the universal sort.
+
+In our formalization of first-order logic, we declared a type~$o$ of
+object-level truth values, rather than using~$prop$ for this purpose. If
+we declared the object-level connectives to have types such as
+${\neg}::prop\To prop$, then these connectives would be applicable to
+meta-level formulae. Keeping $prop$ and $o$ as separate types maintains
+the distinction between the meta-level and the object-level.
To formalize
+the inference rules, we shall need to relate the two levels; accordingly,
+we declare the constant
+\index{*Trueprop constant}
+\begin{eqnarray*}
+  Trueprop & :: & o\To prop.
+\end{eqnarray*}
+We may regard $Trueprop$ as a meta-level predicate, reading $Trueprop(P)$ as
+`$P$ is true at the object-level.' Put another way, $Trueprop$ is a coercion
+from $o$ to $prop$.
+
+
+\subsection{Expressing propositional rules}
+\index{rules!propositional}
+We shall illustrate the use of the meta-logic by formalizing the rules of
+Fig.\ts\ref{fol-fig}. Each object-level rule is expressed as a meta-level
+axiom.
+
+One of the simplest rules is $(\conj E1)$. Making
+everything explicit, its formalization in the meta-logic is
+$$
+\Forall P\;Q. Trueprop(P\conj Q) \Imp Trueprop(P). \eqno(\conj E1)
+$$
+This may look formidable, but it has an obvious reading: for all object-level
+truth values $P$ and~$Q$, if $P\conj Q$ is true then so is~$P$. The
+reading is correct because the meta-logic has simple models, where
+types denote sets and $\Forall$ really means `for all.'
+
+\index{*Trueprop constant}
+Isabelle adopts notational conventions to ease the writing of rules. We may
+hide the occurrences of $Trueprop$ by making it an implicit coercion.
+Outer universal quantifiers may be dropped. Finally, the nested implication
+\index{meta-implication}
+\[ \phi@1\Imp(\cdots \phi@n\Imp\psi\cdots) \]
+may be abbreviated as $\List{\phi@1; \ldots; \phi@n} \Imp \psi$, which
+formalizes a rule of $n$~premises.
+
+Using these conventions, the conjunction rules become the following axioms.
+These fully specify the properties of~$\conj$:
+$$ \List{P; Q} \Imp P\conj Q \eqno(\conj I) $$
+$$ P\conj Q \Imp P \qquad P\conj Q \Imp Q \eqno(\conj E1,2) $$
+
+\noindent
+Next, consider the disjunction rules. The discharge of assumptions in
+$(\disj E)$ is expressed using $\Imp$:
+\index{assumptions!discharge of}%
+$$ P \Imp P\disj Q \qquad Q \Imp P\disj Q \eqno(\disj I1,2) $$
+$$ \List{P\disj Q; P\Imp R; Q\Imp R} \Imp R \eqno(\disj E) $$
+%
+To understand this treatment of assumptions in natural
+deduction, look at implication. The rule $({\imp}I)$ is the classic
+example of natural deduction: to prove that $P\imp Q$ is true, assume $P$
+is true and show that $Q$ must then be true. More concisely, if $P$
+implies $Q$ (at the meta-level), then $P\imp Q$ is true (at the
+object-level). Showing the coercion explicitly, this is formalized as
+\[ (Trueprop(P)\Imp Trueprop(Q)) \Imp Trueprop(P\imp Q). \]
+The rule $({\imp}E)$ is straightforward; hiding $Trueprop$, the axioms to
+specify $\imp$ are
+$$ (P \Imp Q) \Imp P\imp Q \eqno({\imp}I) $$
+$$ \List{P\imp Q; P} \Imp Q. \eqno({\imp}E) $$
+
+\noindent
+Finally, the intuitionistic contradiction rule is formalized as the axiom
+$$ \bot \Imp P. \eqno(\bot E) $$
+
+\begin{warn}
+Earlier versions of Isabelle, and certain
+papers~\cite{paulson-found,paulson700}, use $\List{P}$ to mean $Trueprop(P)$.
+\end{warn}
+
+\subsection{Quantifier rules and substitution}
+\index{quantifiers}\index{rules!quantifier}\index{substitution|bold}
+\index{variables!bound}\index{lambda abs@$\lambda$-abstractions}
+\index{function applications}
+
+Isabelle expresses variable binding using $\lambda$-abstraction; for instance,
+$\forall x.P$ is formalized as $\forall(\lambda x.P)$. Recall that $F(t)$
+is Isabelle's syntax for application of the function~$F$ to the argument~$t$;
+it is not a meta-notation for substitution.
On the other hand, a substitution +will take place if $F$ has the form $\lambda x.P$; Isabelle transforms +$(\lambda x.P)(t)$ to~$P[t/x]$ by $\beta$-conversion. Thus, we can express +inference rules that involve substitution for bound variables. + +\index{parameters|bold}\index{eigenvariables|see{parameters}} +A logic may attach provisos to certain of its rules, especially quantifier +rules. We cannot hope to formalize arbitrary provisos. Fortunately, those +typical of quantifier rules always have the same form, namely `$x$ not free in +\ldots {\it (some set of formulae)},' where $x$ is a variable (called a {\bf +parameter} or {\bf eigenvariable}) in some premise. Isabelle treats +provisos using~$\Forall$, its inbuilt notion of `for all'. +\index{meta-quantifiers} + +The purpose of the proviso `$x$ not free in \ldots' is +to ensure that the premise may not make assumptions about the value of~$x$, +and therefore holds for all~$x$. We formalize $(\forall I)$ by +\[ \left(\Forall x. Trueprop(P(x))\right) \Imp Trueprop(\forall x.P(x)). \] +This means, `if $P(x)$ is true for all~$x$, then $\forall x.P(x)$ is true.' +The $\forall E$ rule exploits $\beta$-conversion. Hiding $Trueprop$, the +$\forall$ axioms are +$$ \left(\Forall x. P(x)\right) \Imp \forall x.P(x) \eqno(\forall I) $$ +$$ (\forall x.P(x)) \Imp P(t). \eqno(\forall E) $$ + +\noindent +We have defined the object-level universal quantifier~($\forall$) +using~$\Forall$. But we do not require meta-level counterparts of all the +connectives of the object-logic! Consider the existential quantifier: +$$ P(t) \Imp \exists x.P(x) \eqno(\exists I) $$ +$$ \List{\exists x.P(x);\; \Forall x. P(x)\Imp Q} \Imp Q \eqno(\exists E) $$ +Let us verify $(\exists E)$ semantically. Suppose that the premises +hold; since $\exists x.P(x)$ is true, we may choose an~$a$ such that $P(a)$ is +true. Instantiating $\Forall x. P(x)\Imp Q$ with $a$ yields $P(a)\Imp Q$, and +we obtain the desired conclusion, $Q$. + +The treatment of substitution deserves mention. The rule +\[ \infer{P[u/t]}{t=u & P} \] +would be hard to formalize in Isabelle. It calls for replacing~$t$ by $u$ +throughout~$P$, which cannot be expressed using $\beta$-conversion. Our +rule~$(subst)$ uses~$P$ as a template for substitution, inferring $P[u/x]$ +from~$P[t/x]$. When we formalize this as an axiom, the template becomes a +function variable: +$$ \List{t=u; P(t)} \Imp P(u). \eqno(subst) $$ + + +\subsection{Signatures and theories} +\index{signatures|bold} + +A {\bf signature} contains the information necessary for type-checking, +parsing and pretty printing a term. It specifies type classes and their +relationships, types and their arities, constants and their types, etc. It +also contains grammar rules, specified using mixfix declarations. + +Two signatures can be merged provided their specifications are compatible --- +they must not, for example, assign different types to the same constant. +Under similar conditions, a signature can be extended. Signatures are +managed internally by Isabelle; users seldom encounter them. + +\index{theories|bold} A {\bf theory} consists of a signature plus a collection +of axioms. The Pure theory contains only the meta-logic. Theories can be +combined provided their signatures are compatible. A theory definition +extends an existing theory with further signature specifications --- classes, +types, constants and mixfix declarations --- plus lists of axioms and +definitions etc., expressed as strings to be parsed. 
A theory can formalize a +small piece of mathematics, such as lists and their operations, or an entire +logic. A mathematical development typically involves many theories in a +hierarchy. For example, the Pure theory could be extended to form a theory +for Fig.\ts\ref{fol-fig}; this could be extended in two separate ways to form +a theory for natural numbers and a theory for lists; the union of these two +could be extended into a theory defining the length of a list: +\begin{tt} +\[ +\begin{array}{c@{}c@{}c@{}c@{}c} + {} & {} &\hbox{Pure}& {} & {} \\ + {} & {} & \downarrow & {} & {} \\ + {} & {} &\hbox{FOL} & {} & {} \\ + {} & \swarrow & {} & \searrow & {} \\ + \hbox{Nat} & {} & {} & {} & \hbox{List} \\ + {} & \searrow & {} & \swarrow & {} \\ + {} & {} &\hbox{Nat}+\hbox{List}& {} & {} \\ + {} & {} & \downarrow & {} & {} \\ + {} & {} & \hbox{Length} & {} & {} +\end{array} +\] +\end{tt}% +Each Isabelle proof typically works within a single theory, which is +associated with the proof state. However, many different theories may +coexist at the same time, and you may work in each of these during a single +session. + +\begin{warn}\index{constants!clashes with variables}% + Confusing problems arise if you work in the wrong theory. Each theory + defines its own syntax. An identifier may be regarded in one theory as a + constant and in another as a variable, for example. +\end{warn} + +\section{Proof construction in Isabelle} +I have elsewhere described the meta-logic and demonstrated it by +formalizing first-order logic~\cite{paulson-found}. There is a one-to-one +correspondence between meta-level proofs and object-level proofs. To each +use of a meta-level axiom, such as $(\forall I)$, there is a use of the +corresponding object-level rule. Object-level assumptions and parameters +have meta-level counterparts. The meta-level formalization is {\bf + faithful}, admitting no incorrect object-level inferences, and {\bf + adequate}, admitting all correct object-level inferences. These +properties must be demonstrated separately for each object-logic. + +The meta-logic is defined by a collection of inference rules, including +equational rules for the $\lambda$-calculus and logical rules. The rules +for~$\Imp$ and~$\Forall$ resemble those for~$\imp$ and~$\forall$ in +Fig.\ts\ref{fol-fig}. Proofs performed using the primitive meta-rules +would be lengthy; Isabelle proofs normally use certain derived rules. +{\bf Resolution}, in particular, is convenient for backward proof. + +Unification is central to theorem proving. It supports quantifier +reasoning by allowing certain `unknown' terms to be instantiated later, +possibly in stages. When proving that the time required to sort $n$ +integers is proportional to~$n^2$, we need not state the constant of +proportionality; when proving that a hardware adder will deliver the sum of +its inputs, we need not state how many clock ticks will be required. Such +quantities often emerge from the proof. + +Isabelle provides {\bf schematic variables}, or {\bf + unknowns},\index{unknowns} for unification. Logically, unknowns are free +variables. But while ordinary variables remain fixed, unification may +instantiate unknowns. Unknowns are written with a ?\ prefix and are +frequently subscripted: $\Var{a}$, $\Var{a@1}$, $\Var{a@2}$, \ldots, +$\Var{P}$, $\Var{P@1}$, \ldots. 
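+
+As a small example of such staged instantiation, consider proving
+$\exists x. Suc(0)=x$ from the rules of Fig.\ts\ref{fol-fig}. We need not
+supply the witness in advance: working backwards by $(\exists I)$ reduces
+the goal to an equation containing an unknown, and unifying that equation
+with the axiom $(refl)$ determines the witness afterwards:
+\[ \exists x. Suc(0)=x \quad\hbox{reduces to}\quad Suc(0)=\Var{t},
+   \quad\hbox{and unification forces}\quad \Var{t}\equiv Suc(0). \]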
+ +Recall that an inference rule of the form +\[ \infer{\phi}{\phi@1 & \ldots & \phi@n} \] +is formalized in Isabelle's meta-logic as the axiom +$\List{\phi@1; \ldots; \phi@n} \Imp \phi$.\index{resolution} +Such axioms resemble Prolog's Horn clauses, and can be combined by +resolution --- Isabelle's principal proof method. Resolution yields both +forward and backward proof. Backward proof works by unifying a goal with +the conclusion of a rule, whose premises become new subgoals. Forward proof +works by unifying theorems with the premises of a rule, deriving a new theorem. + +Isabelle formulae require an extended notion of resolution. +They differ from Horn clauses in two major respects: +\begin{itemize} + \item They are written in the typed $\lambda$-calculus, and therefore must be +resolved using higher-order unification. + +\item The constituents of a clause need not be atomic formulae. Any + formula of the form $Trueprop(\cdots)$ is atomic, but axioms such as + ${\imp}I$ and $\forall I$ contain non-atomic formulae. +\end{itemize} +Isabelle has little in common with classical resolution theorem provers +such as Otter~\cite{wos-bledsoe}. At the meta-level, Isabelle proves +theorems in their positive form, not by refutation. However, an +object-logic that includes a contradiction rule may employ a refutation +proof procedure. + + +\subsection{Higher-order unification} +\index{unification!higher-order|bold} +Unification is equation solving. The solution of $f(\Var{x},c) \qeq +f(d,\Var{y})$ is $\Var{x}\equiv d$ and $\Var{y}\equiv c$. {\bf +Higher-order unification} is equation solving for typed $\lambda$-terms. +To handle $\beta$-conversion, it must reduce $(\lambda x.t)u$ to $t[u/x]$. +That is easy --- in the typed $\lambda$-calculus, all reduction sequences +terminate at a normal form. But it must guess the unknown +function~$\Var{f}$ in order to solve the equation +\begin{equation} \label{hou-eqn} + \Var{f}(t) \qeq g(u@1,\ldots,u@k). +\end{equation} +Huet's~\cite{huet75} search procedure solves equations by imitation and +projection. {\bf Imitation} makes~$\Var{f}$ apply the leading symbol (if a +constant) of the right-hand side. To solve equation~(\ref{hou-eqn}), it +guesses +\[ \Var{f} \equiv \lambda x. g(\Var{h@1}(x),\ldots,\Var{h@k}(x)), \] +where $\Var{h@1}$, \ldots, $\Var{h@k}$ are new unknowns. Assuming there are no +other occurrences of~$\Var{f}$, equation~(\ref{hou-eqn}) simplifies to the +set of equations +\[ \Var{h@1}(t)\qeq u@1 \quad\ldots\quad \Var{h@k}(t)\qeq u@k. \] +If the procedure solves these equations, instantiating $\Var{h@1}$, \ldots, +$\Var{h@k}$, then it yields an instantiation for~$\Var{f}$. + +{\bf Projection} makes $\Var{f}$ apply one of its arguments. To solve +equation~(\ref{hou-eqn}), if $t$ expects~$m$ arguments and delivers a +result of suitable type, it guesses +\[ \Var{f} \equiv \lambda x. x(\Var{h@1}(x),\ldots,\Var{h@m}(x)), \] +where $\Var{h@1}$, \ldots, $\Var{h@m}$ are new unknowns. Assuming there are no +other occurrences of~$\Var{f}$, equation~(\ref{hou-eqn}) simplifies to the +equation +\[ t(\Var{h@1}(t),\ldots,\Var{h@m}(t)) \qeq g(u@1,\ldots,u@k). \] + +\begin{warn}\index{unification!incompleteness of}% +Huet's unification procedure is complete. Isabelle's polymorphic version, +which solves for type unknowns as well as for term unknowns, is incomplete. +The problem is that projection requires type information. 
In +equation~(\ref{hou-eqn}), if the type of~$t$ is unknown, then projections +are possible for all~$m\geq0$, and the types of the $\Var{h@i}$ will be +similarly unconstrained. Therefore, Isabelle never attempts such +projections, and may fail to find unifiers where a type unknown turns out +to be a function type. +\end{warn} + +\index{unknowns!function|bold} +Given $\Var{f}(t@1,\ldots,t@n)\qeq u$, Huet's procedure could make up to +$n+1$ guesses. The search tree and set of unifiers may be infinite. But +higher-order unification can work effectively, provided you are careful +with {\bf function unknowns}: +\begin{itemize} + \item Equations with no function unknowns are solved using first-order +unification, extended to treat bound variables. For example, $\lambda x.x +\qeq \lambda x.\Var{y}$ has no solution because $\Var{y}\equiv x$ would +capture the free variable~$x$. + + \item An occurrence of the term $\Var{f}(x,y,z)$, where the arguments are +distinct bound variables, causes no difficulties. Its projections can only +match the corresponding variables. + + \item Even an equation such as $\Var{f}(a)\qeq a+a$ is all right. It has +four solutions, but Isabelle evaluates them lazily, trying projection before +imitation. The first solution is usually the one desired: +\[ \Var{f}\equiv \lambda x. x+x \quad + \Var{f}\equiv \lambda x. a+x \quad + \Var{f}\equiv \lambda x. x+a \quad + \Var{f}\equiv \lambda x. a+a \] + \item Equations such as $\Var{f}(\Var{x},\Var{y})\qeq t$ and +$\Var{f}(\Var{g}(x))\qeq t$ admit vast numbers of unifiers, and must be +avoided. +\end{itemize} +In problematic cases, you may have to instantiate some unknowns before +invoking unification. + + +\subsection{Joining rules by resolution} \label{joining} +\index{resolution|bold} +Let $\List{\psi@1; \ldots; \psi@m} \Imp \psi$ and $\List{\phi@1; \ldots; +\phi@n} \Imp \phi$ be two Isabelle theorems, representing object-level rules. +Choosing some~$i$ from~1 to~$n$, suppose that $\psi$ and $\phi@i$ have a +higher-order unifier. Writing $Xs$ for the application of substitution~$s$ to +expression~$X$, this means there is some~$s$ such that $\psi s\equiv \phi@i s$. +By resolution, we may conclude +\[ (\List{\phi@1; \ldots; \phi@{i-1}; \psi@1; \ldots; \psi@m; + \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s. +\] +The substitution~$s$ may instantiate unknowns in both rules. In short, +resolution is the following rule: +\[ \infer[(\psi s\equiv \phi@i s)] + {(\List{\phi@1; \ldots; \phi@{i-1}; \psi@1; \ldots; \psi@m; + \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s} + {\List{\psi@1; \ldots; \psi@m} \Imp \psi & & + \List{\phi@1; \ldots; \phi@n} \Imp \phi} +\] +It operates at the meta-level, on Isabelle theorems, and is justified by +the properties of $\Imp$ and~$\Forall$. It takes the number~$i$ (for +$1\leq i\leq n$) as a parameter and may yield infinitely many conclusions, +one for each unifier of $\psi$ with $\phi@i$. Isabelle returns these +conclusions as a sequence (lazy list). + +Resolution expects the rules to have no outer quantifiers~($\Forall$). +It may rename or instantiate any schematic variables, but leaves free +variables unchanged. When constructing a theory, Isabelle puts the +rules into a standard form with all free variables converted into +schematic ones; for instance, $({\imp}E)$ becomes +\[ \List{\Var{P}\imp \Var{Q}; \Var{P}} \Imp \Var{Q}. +\] +When resolving two rules, the unknowns in the first rule are renamed, by +subscripting, to make them distinct from the unknowns in the second rule. 
To
+resolve $({\imp}E)$ with itself, the first copy of the rule becomes
+\[ \List{\Var{P@1}\imp \Var{Q@1}; \Var{P@1}} \Imp \Var{Q@1}. \]
+Resolving this with $({\imp}E)$ in the first premise, unifying $\Var{Q@1}$ with
+$\Var{P}\imp \Var{Q}$, is the meta-level inference
+\[ \infer{\List{\Var{P@1}\imp (\Var{P}\imp \Var{Q}); \Var{P@1}; \Var{P}}
+          \Imp\Var{Q}.}
+         {\List{\Var{P@1}\imp \Var{Q@1}; \Var{P@1}} \Imp \Var{Q@1} & &
+          \List{\Var{P}\imp \Var{Q}; \Var{P}} \Imp \Var{Q}}
+\]
+Renaming the unknowns in the resolvent, we have derived the
+object-level rule\index{rules!derived}
+\[ \infer{Q.}{R\imp (P\imp Q) & R & P} \]
+Joining rules in this fashion is a simple way of proving theorems. The
+derived rules are conservative extensions of the object-logic, and may permit
+simpler proofs. Let us consider another example. Suppose we have the axiom
+$$ \forall x\,y. Suc(x)=Suc(y)\imp x=y. \eqno(inject) $$
+
+\noindent
+The standard form of $(\forall E)$ is
+$(\forall x.\Var{P}(x)) \Imp \Var{P}(\Var{t})$.
+Resolving $(inject)$ with $(\forall E)$ replaces $\Var{P}$ by
+$\lambda x. \forall y. Suc(x)=Suc(y)\imp x=y$ and leaves $\Var{t}$
+unchanged, yielding
+\[ \forall y. Suc(\Var{t})=Suc(y)\imp \Var{t}=y. \]
+Resolving this with $(\forall E)$ puts a subscript on~$\Var{t}$
+and yields
+\[ Suc(\Var{t@1})=Suc(\Var{t})\imp \Var{t@1}=\Var{t}. \]
+Resolving this with $({\imp}E)$ increases the subscripts and yields
+\[ Suc(\Var{t@2})=Suc(\Var{t@1})\Imp \Var{t@2}=\Var{t@1}.
+\]
+We have derived the rule
+\[ \infer{m=n,}{Suc(m)=Suc(n)} \]
+which goes directly from $Suc(m)=Suc(n)$ to $m=n$. It is handy for simplifying
+an equation like $Suc(Suc(Suc(m)))=Suc(Suc(Suc(0)))$.
+
+
+\section{Lifting a rule into a context}
+The rules $({\imp}I)$ and $(\forall I)$ may seem unsuitable for
+resolution. They have non-atomic premises, namely $P\Imp Q$ and $\Forall
+x.P(x)$, while the conclusions of all the rules are atomic (they have the form
+$Trueprop(\cdots)$). Isabelle gets round the problem through a meta-inference
+called \bfindex{lifting}. Let us consider how to construct proofs such as
+\[ \infer[({\imp}I)]{P\imp(Q\imp R)}
+         {\infer[({\imp}I)]{Q\imp R}
+                {\infer*{R}{[P,Q]}}}
+   \qquad
+   \infer[(\forall I)]{\forall x\,y.P(x,y)}
+         {\infer[(\forall I)]{\forall y.P(x,y)}{P(x,y)}}
+\]
+
+\subsection{Lifting over assumptions}
+\index{assumptions!lifting over}
+Lifting over $\theta\Imp{}$ is the following meta-inference rule:
+\[ \infer{\List{\theta\Imp\phi@1; \ldots; \theta\Imp\phi@n} \Imp
+          (\theta \Imp \phi)}
+         {\List{\phi@1; \ldots; \phi@n} \Imp \phi} \]
+This is clearly sound: if $\List{\phi@1; \ldots; \phi@n} \Imp \phi$ is true and
+$\theta\Imp\phi@1$, \ldots, $\theta\Imp\phi@n$ and $\theta$ are all true then
+$\phi$ must be true. Iterated lifting over a series of meta-formulae
+$\theta@k$, \ldots, $\theta@1$ yields an object-rule whose conclusion is
+$\List{\theta@1; \ldots; \theta@k} \Imp \phi$. Typically the $\theta@i$ are
+the assumptions in a natural deduction proof; lifting copies them into a rule's
+premises and conclusion.
+
+When resolving two rules, Isabelle lifts the first one if necessary. The
+standard form of $({\imp}I)$ is
+\[ (\Var{P} \Imp \Var{Q}) \Imp \Var{P}\imp \Var{Q}. \]
+To resolve this rule with itself, Isabelle modifies one copy as follows: it
+renames the unknowns to $\Var{P@1}$ and $\Var{Q@1}$, then lifts the rule over
+$\Var{P}\Imp{}$ to obtain
+\[ (\Var{P}\Imp (\Var{P@1} \Imp \Var{Q@1})) \Imp (\Var{P} \Imp
+   (\Var{P@1}\imp \Var{Q@1})).
\]
+Using the $\List{\cdots}$ abbreviation, this can be written as
+\[ \List{\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}; \Var{P}}
+   \Imp \Var{P@1}\imp \Var{Q@1}. \]
+Unifying $\Var{P}\Imp \Var{P@1}\imp\Var{Q@1}$ with $\Var{P} \Imp
+\Var{Q}$ instantiates $\Var{Q}$ to ${\Var{P@1}\imp\Var{Q@1}}$.
+Resolution yields
+\[ (\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}) \Imp
+\Var{P}\imp(\Var{P@1}\imp\Var{Q@1}). \]
+This represents the derived rule
+\[ \infer{P\imp(Q\imp R).}{\infer*{R}{[P,Q]}} \]
+
+\subsection{Lifting over parameters}
+\index{parameters!lifting over}
+An analogous form of lifting handles premises of the form $\Forall x\ldots\,$.
+Here, lifting prefixes an object-rule's premises and conclusion with $\Forall
+x$. At the same time, lifting introduces a dependence upon~$x$. It replaces
+each unknown $\Var{a}$ in the rule by $\Var{a'}(x)$, where $\Var{a'}$ is a new
+unknown (by subscripting) of suitable type --- necessarily a function type. In
+short, lifting is the meta-inference
+\[ \infer{\List{\Forall x.\phi@1^x; \ldots; \Forall x.\phi@n^x}
+          \Imp \Forall x.\phi^x,}
+         {\List{\phi@1; \ldots; \phi@n} \Imp \phi} \]
+%
+where $\phi^x$ stands for the result of lifting unknowns over~$x$ in
+$\phi$. It is not hard to verify that this meta-inference is sound. If
+$\phi\Imp\psi$ then $\phi^x\Imp\psi^x$ for all~$x$; so if $\phi^x$ is true
+for all~$x$ then so is $\psi^x$. Thus, from $\phi\Imp\psi$ we conclude
+$(\Forall x.\phi^x) \Imp (\Forall x.\psi^x)$.
+
+For example, $(\disj I)$ might be lifted to
+\[ (\Forall x.\Var{P@1}(x)) \Imp (\Forall x. \Var{P@1}(x)\disj \Var{Q@1}(x))\]
+and $(\forall I)$ to
+\[ (\Forall x\,y.\Var{P@1}(x,y)) \Imp (\Forall x. \forall y.\Var{P@1}(x,y)). \]
+Isabelle has renamed a bound variable in $(\forall I)$ from $x$ to~$y$,
+avoiding a clash. Resolving the above with $(\forall I)$ is the meta-inference
+\[ \infer{(\Forall x\,y.\Var{P@1}(x,y)) \Imp \forall x\,y.\Var{P@1}(x,y)}
+         {(\Forall x\,y.\Var{P@1}(x,y)) \Imp
+               (\Forall x. \forall y.\Var{P@1}(x,y))    &
+          (\Forall x.\Var{P}(x)) \Imp (\forall x.\Var{P}(x))} \]
+Here, $\Var{P}$ is replaced by $\lambda x.\forall y.\Var{P@1}(x,y)$; the
+resolvent expresses the derived rule
+\[ \vcenter{ \infer{\forall x\,y.Q(x,y)}{Q(x,y)} }
+   \quad\hbox{provided $x$, $y$ not free in the assumptions}
+\]
+I discuss lifting and parameters at length elsewhere~\cite{paulson-found}.
+Miller goes into even greater detail~\cite{miller-mixed}.
+
+
+\section{Backward proof by resolution}
+\index{resolution!in backward proof}
+
+Resolution is convenient for deriving simple rules and for reasoning
+forward from facts. It can also support backward proof, where we start
+with a goal and refine it to progressively simpler subgoals until all have
+been solved. {\sc lcf} and its descendants {\sc hol} and Nuprl provide
+tactics and tacticals, which constitute a sophisticated language for
+expressing proof searches. {\bf Tactics} refine subgoals while {\bf
+ tacticals} combine tactics.
+
+\index{LCF system}
+Isabelle's tactics and tacticals work differently from {\sc lcf}'s. An
+Isabelle rule is bidirectional: there is no distinction between
+inputs and outputs. {\sc lcf} has a separate tactic for each rule;
+Isabelle performs refinement by any rule in a uniform fashion, using
+resolution.
+
+Isabelle works with meta-level theorems of the form
+\( \List{\phi@1; \ldots; \phi@n} \Imp \phi \).
+We have viewed this as the {\bf rule} with premises
+$\phi@1$,~\ldots,~$\phi@n$ and conclusion~$\phi$.
It can also be viewed as
+the {\bf proof state}\index{proof state}
+with subgoals $\phi@1$,~\ldots,~$\phi@n$ and main
+goal~$\phi$.
+
+To prove the formula~$\phi$, take $\phi\Imp \phi$ as the initial proof
+state. This assertion is, trivially, a theorem. At a later stage in the
+backward proof, a typical proof state is $\List{\phi@1; \ldots; \phi@n}
+\Imp \phi$. This proof state is a theorem, ensuring that the subgoals
+$\phi@1$,~\ldots,~$\phi@n$ imply~$\phi$. If $n=0$ then we have
+proved~$\phi$ outright. If $\phi$ contains unknowns, they may become
+instantiated during the proof; a proof state may be $\List{\phi@1; \ldots;
+\phi@n} \Imp \phi'$, where $\phi'$ is an instance of~$\phi$.
+
+\subsection{Refinement by resolution}
+To refine subgoal~$i$ of a proof state by a rule, perform the following
+resolution:
+\[ \infer{\hbox{new proof state}}{\hbox{rule} & & \hbox{proof state}} \]
+Suppose the rule is $\List{\psi'@1; \ldots; \psi'@m} \Imp \psi'$ after
+lifting over subgoal~$i$'s assumptions and parameters. If the proof state
+is $\List{\phi@1; \ldots; \phi@n} \Imp \phi$, then the new proof state is
+(for~$1\leq i\leq n$)
+\[ (\List{\phi@1; \ldots; \phi@{i-1}; \psi'@1;
+          \ldots; \psi'@m; \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s. \]
+Substitution~$s$ unifies $\psi'$ with~$\phi@i$. In the proof state,
+subgoal~$i$ is replaced by $m$ new subgoals, the rule's instantiated premises.
+If some of the rule's unknowns are left un-instantiated, they become new
+unknowns in the proof state. Refinement by~$(\exists I)$, namely
+\[ \Var{P}(\Var{t}) \Imp \exists x. \Var{P}(x), \]
+inserts a new unknown derived from~$\Var{t}$ by subscripting and lifting.
+We do not have to specify an `existential witness' when
+applying~$(\exists I)$. Further resolutions may instantiate unknowns in
+the proof state.
+
+\subsection{Proof by assumption}
+\index{assumptions!use of}
+In the course of a natural deduction proof, parameters $x@1$, \ldots,~$x@l$ and
+assumptions $\theta@1$, \ldots, $\theta@k$ accumulate, forming a context for
+each subgoal. Repeated lifting steps can lift a rule into any context. To
+aid readability, Isabelle puts contexts into a normal form, gathering the
+parameters at the front:
+\begin{equation} \label{context-eqn}
+\Forall x@1 \ldots x@l. \List{\theta@1; \ldots; \theta@k}\Imp\theta.
+\end{equation}
+Under the usual reading of the connectives, this expresses that $\theta$
+follows from $\theta@1$,~\ldots,~$\theta@k$ for arbitrary
+$x@1$,~\ldots,~$x@l$. It is trivially true if $\theta$ equals any of
+$\theta@1$,~\ldots,~$\theta@k$, or is unifiable with any of them. This
+models proof by assumption in natural deduction.
+
+Isabelle automates the meta-inference for proof by assumption. Its arguments
+are the meta-theorem $\List{\phi@1; \ldots; \phi@n} \Imp \phi$, and some~$i$
+from~1 to~$n$, where $\phi@i$ has the form~(\ref{context-eqn}). Its results
+are meta-theorems of the form
+\[ (\List{\phi@1; \ldots; \phi@{i-1}; \phi@{i+1}; \ldots; \phi@n} \Imp \phi)s \]
+for each $s$ and~$j$ such that $s$ unifies $\lambda x@1 \ldots x@l. \theta@j$
+with $\lambda x@1 \ldots x@l. \theta$. Isabelle supplies the parameters
+$x@1$,~\ldots,~$x@l$ to higher-order unification as bound variables, which
+regards them as unique constants with a limited scope --- this enforces
+parameter provisos~\cite{paulson-found}.
+
+The premise represents a proof state with~$n$ subgoals, of which the~$i$th
+is to be solved by assumption. Isabelle searches the subgoal's context for
+an assumption~$\theta@j$ that can solve it.
For each unifier, the
+meta-inference returns an instantiated proof state from which the $i$th
+subgoal has been removed. Isabelle itself searches for the unifying
+assumption; for readability and robustness, proofs do not refer to
+assumptions by number.
+
+Consider the proof state
+\[ (\List{P(a); P(b)} \Imp P(\Var{x})) \Imp Q(\Var{x}). \]
+Proof by assumption (with $i=1$, the only possibility) yields two results:
+\begin{itemize}
+  \item $Q(a)$, instantiating $\Var{x}\equiv a$
+  \item $Q(b)$, instantiating $\Var{x}\equiv b$
+\end{itemize}
+Here, proof by assumption affects the main goal. It could also affect
+other subgoals; if we also had the subgoal ${\List{P(b); P(c)} \Imp
+  P(\Var{x})}$, then $\Var{x}\equiv a$ would transform it to ${\List{P(b);
+  P(c)} \Imp P(a)}$, which might be unprovable.
+
+
+\subsection{A propositional proof} \label{prop-proof}
+\index{examples!propositional}
+Our first example avoids quantifiers. Given the main goal $P\disj P\imp
+P$, Isabelle creates the initial state
+\[ (P\disj P\imp P)\Imp (P\disj P\imp P). \]
+%
+Bear in mind that every proof state we derive will be a meta-theorem,
+expressing that the subgoals imply the main goal. Our aim is to reach the
+state $P\disj P\imp P$; this meta-theorem is the desired result.
+
+The first step is to refine subgoal~1 by $({\imp}I)$, creating a new state
+where $P\disj P$ is an assumption:
+\[ (P\disj P\Imp P)\Imp (P\disj P\imp P) \]
+The next step is $(\disj E)$, which replaces subgoal~1 by three new subgoals.
+Because of lifting, each subgoal contains a copy of the context --- the
+assumption $P\disj P$. (In fact, this assumption is now redundant; we shall
+shortly see how to get rid of it!) The new proof state is the following
+meta-theorem, laid out for clarity:
+\[ \begin{array}{l@{}l@{\qquad\qquad}l}
+  \lbrakk\;& P\disj P\Imp \Var{P@1}\disj\Var{Q@1}; & \hbox{(subgoal 1)} \\
+           & \List{P\disj P; \Var{P@1}} \Imp P;    & \hbox{(subgoal 2)} \\
+           & \List{P\disj P; \Var{Q@1}} \Imp P     & \hbox{(subgoal 3)} \\
+  \rbrakk\;& \Imp (P\disj P\imp P)                 & \hbox{(main goal)}
+   \end{array}
+\]
+Notice the unknowns in the proof state. Because we have applied $(\disj E)$,
+we must prove some disjunction, $\Var{P@1}\disj\Var{Q@1}$. Of course,
+subgoal~1 is provable by assumption. This instantiates both $\Var{P@1}$ and
+$\Var{Q@1}$ to~$P$ throughout the proof state:
+\[ \begin{array}{l@{}l@{\qquad\qquad}l}
+  \lbrakk\;& \List{P\disj P; P} \Imp P; & \hbox{(subgoal 1)} \\
+           & \List{P\disj P; P} \Imp P  & \hbox{(subgoal 2)} \\
+  \rbrakk\;& \Imp (P\disj P\imp P)      & \hbox{(main goal)}
+   \end{array} \]
+Both of the remaining subgoals can be proved by assumption. After two such
+steps, the proof state is $P\disj P\imp P$.
+
+
+\subsection{A quantifier proof}
+\index{examples!with quantifiers}
+To illustrate quantifiers and $\Forall$-lifting, let us prove
+$(\exists x.P(f(x)))\imp(\exists x.P(x))$. The initial proof
+state is the trivial meta-theorem
+\[ (\exists x.P(f(x)))\imp(\exists x.P(x)) \Imp
+   (\exists x.P(f(x)))\imp(\exists x.P(x)). \]
+As above, the first step is refinement by $({\imp}I)$:
+\[ (\exists x.P(f(x))\Imp \exists x.P(x)) \Imp
+   (\exists x.P(f(x)))\imp(\exists x.P(x))
+\]
+The next step is $(\exists E)$, which replaces subgoal~1 by two new subgoals.
+Both have the assumption $\exists x.P(f(x))$.
The new proof +state is the meta-theorem +\[ \begin{array}{l@{}l@{\qquad\qquad}l} + \lbrakk\;& \exists x.P(f(x)) \Imp \exists x.\Var{P@1}(x); & \hbox{(subgoal 1)} \\ + & \Forall x.\List{\exists x.P(f(x)); \Var{P@1}(x)} \Imp + \exists x.P(x) & \hbox{(subgoal 2)} \\ + \rbrakk\;& \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) & \hbox{(main goal)} + \end{array} +\] +The unknown $\Var{P@1}$ appears in both subgoals. Because we have applied +$(\exists E)$, we must prove $\exists x.\Var{P@1}(x)$, where $\Var{P@1}(x)$ may +become any formula possibly containing~$x$. Proving subgoal~1 by assumption +instantiates $\Var{P@1}$ to~$\lambda x.P(f(x))$: +\[ \left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp + \exists x.P(x)\right) + \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) +\] +The next step is refinement by $(\exists I)$. The rule is lifted into the +context of the parameter~$x$ and the assumption $P(f(x))$. This copies +the context to the subgoal and allows the existential witness to +depend upon~$x$: +\[ \left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp + P(\Var{x@2}(x))\right) + \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) +\] +The existential witness, $\Var{x@2}(x)$, consists of an unknown +applied to a parameter. Proof by assumption unifies $\lambda x.P(f(x))$ +with $\lambda x.P(\Var{x@2}(x))$, instantiating $\Var{x@2}$ to $f$. The final +proof state contains no subgoals: $(\exists x.P(f(x)))\imp(\exists x.P(x))$. + + +\subsection{Tactics and tacticals} +\index{tactics|bold}\index{tacticals|bold} +{\bf Tactics} perform backward proof. Isabelle tactics differ from those +of {\sc lcf}, {\sc hol} and Nuprl by operating on entire proof states, +rather than on individual subgoals. An Isabelle tactic is a function that +takes a proof state and returns a sequence (lazy list) of possible +successor states. Lazy lists are coded in ML as functions, a standard +technique~\cite{paulson-ml2}. Isabelle represents proof states by theorems. + +Basic tactics execute the meta-rules described above, operating on a +given subgoal. The {\bf resolution tactics} take a list of rules and +return next states for each combination of rule and unifier. The {\bf +assumption tactic} examines the subgoal's assumptions and returns next +states for each combination of assumption and unifier. Lazy lists are +essential because higher-order resolution may return infinitely many +unifiers. If there are no matching rules or assumptions then no next +states are generated; a tactic application that returns an empty list is +said to {\bf fail}. + +Sequences realize their full potential with {\bf tacticals} --- operators +for combining tactics. Depth-first search, breadth-first search and +best-first search (where a heuristic function selects the best state to +explore) return their outcomes as a sequence. Isabelle provides such +procedures in the form of tacticals. Simpler procedures can be expressed +directly using the basic tacticals {\tt THEN}, {\tt ORELSE} and {\tt REPEAT}: +\begin{ttdescription} +\item[$tac1$ THEN $tac2$] is a tactic for sequential composition. Applied +to a proof state, it returns all states reachable in two steps by applying +$tac1$ followed by~$tac2$. + +\item[$tac1$ ORELSE $tac2$] is a choice tactic. Applied to a state, it +tries~$tac1$ and returns the result if non-empty; otherwise, it uses~$tac2$. + +\item[REPEAT $tac$] is a repetition tactic. Applied to a state, it +returns all states reachable by applying~$tac$ as long as possible --- until +it would fail. 
+\end{ttdescription} +For instance, this tactic repeatedly applies $tac1$ and~$tac2$, giving +$tac1$ priority: +\begin{center} \tt +REPEAT($tac1$ ORELSE $tac2$) +\end{center} + + +\section{Variations on resolution} +In principle, resolution and proof by assumption suffice to prove all +theorems. However, specialized forms of resolution are helpful for working +with elimination rules. Elim-resolution applies an elimination rule to an +assumption; destruct-resolution is similar, but applies a rule in a forward +style. + +The last part of the section shows how the techniques for proving theorems +can also serve to derive rules. + +\subsection{Elim-resolution} +\index{elim-resolution|bold}\index{assumptions!deleting} + +Consider proving the theorem $((R\disj R)\disj R)\disj R\imp R$. By +$({\imp}I)$, we prove~$R$ from the assumption $((R\disj R)\disj R)\disj R$. +Applying $(\disj E)$ to this assumption yields two subgoals, one that +assumes~$R$ (and is therefore trivial) and one that assumes $(R\disj +R)\disj R$. This subgoal admits another application of $(\disj E)$. Since +natural deduction never discards assumptions, we eventually generate a +subgoal containing much that is redundant: +\[ \List{((R\disj R)\disj R)\disj R; (R\disj R)\disj R; R\disj R; R} \Imp R. \] +In general, using $(\disj E)$ on the assumption $P\disj Q$ creates two new +subgoals with the additional assumption $P$ or~$Q$. In these subgoals, +$P\disj Q$ is redundant. Other elimination rules behave +similarly. In first-order logic, only universally quantified +assumptions are sometimes needed more than once --- say, to prove +$P(f(f(a)))$ from the assumptions $\forall x.P(x)\imp P(f(x))$ and~$P(a)$. + +Many logics can be formulated as sequent calculi that delete redundant +assumptions after use. The rule $(\disj E)$ might become +\[ \infer[\disj\hbox{-left}] + {\Gamma,P\disj Q,\Delta \turn \Theta} + {\Gamma,P,\Delta \turn \Theta && \Gamma,Q,\Delta \turn \Theta} \] +In backward proof, a goal containing $P\disj Q$ on the left of the~$\turn$ +(that is, as an assumption) splits into two subgoals, replacing $P\disj Q$ +by $P$ or~$Q$. But the sequent calculus, with its explicit handling of +assumptions, can be tiresome to use. + +Elim-resolution is Isabelle's way of getting sequent calculus behaviour +from natural deduction rules. It lets an elimination rule consume an +assumption. Elim-resolution combines two meta-theorems: +\begin{itemize} + \item a rule $\List{\psi@1; \ldots; \psi@m} \Imp \psi$ + \item a proof state $\List{\phi@1; \ldots; \phi@n} \Imp \phi$ +\end{itemize} +The rule must have at least one premise, thus $m>0$. Write the rule's +lifted form as $\List{\psi'@1; \ldots; \psi'@m} \Imp \psi'$. Suppose we +wish to change subgoal number~$i$. + +Ordinary resolution would attempt to reduce~$\phi@i$, +replacing subgoal~$i$ by $m$ new ones. Elim-resolution tries +simultaneously to reduce~$\phi@i$ and to solve~$\psi'@1$ by assumption; it +returns a sequence of next states. Each of these replaces subgoal~$i$ by +instances of $\psi'@2$, \ldots, $\psi'@m$ from which the selected +assumption has been deleted. Suppose $\phi@i$ has the parameter~$x$ and +assumptions $\theta@1$,~\ldots,~$\theta@k$. Then $\psi'@1$, the rule's first +premise after lifting, will be +\( \Forall x. \List{\theta@1; \ldots; \theta@k}\Imp \psi^{x}@1 \). +Elim-resolution tries to unify $\psi'\qeq\phi@i$ and +$\lambda x. \theta@j \qeq \lambda x. \psi^{x}@1$ simultaneously, for +$j=1$,~\ldots,~$k$. + +Let us redo the example from~\S\ref{prop-proof}. 
The elimination rule +is~$(\disj E)$, +\[ \List{\Var{P}\disj \Var{Q};\; \Var{P}\Imp \Var{R};\; \Var{Q}\Imp \Var{R}} + \Imp \Var{R} \] +and the proof state is $(P\disj P\Imp P)\Imp (P\disj P\imp P)$. The +lifted rule is +\[ \begin{array}{l@{}l} + \lbrakk\;& P\disj P \Imp \Var{P@1}\disj\Var{Q@1}; \\ + & \List{P\disj P ;\; \Var{P@1}} \Imp \Var{R@1}; \\ + & \List{P\disj P ;\; \Var{Q@1}} \Imp \Var{R@1} \\ + \rbrakk\;& \Imp (P\disj P \Imp \Var{R@1}) + \end{array} +\] +Unification takes the simultaneous equations +$P\disj P \qeq \Var{P@1}\disj\Var{Q@1}$ and $\Var{R@1} \qeq P$, yielding +$\Var{P@1}\equiv\Var{Q@1}\equiv\Var{R@1} \equiv P$. The new proof state +is simply +\[ \List{P \Imp P;\; P \Imp P} \Imp (P\disj P\imp P). +\] +Elim-resolution's simultaneous unification gives better control +than ordinary resolution. Recall the substitution rule: +$$ \List{\Var{t}=\Var{u}; \Var{P}(\Var{t})} \Imp \Var{P}(\Var{u}) +\eqno(subst) $$ +Unsuitable for ordinary resolution because $\Var{P}(\Var{u})$ admits many +unifiers, $(subst)$ works well with elim-resolution. It deletes some +assumption of the form $x=y$ and replaces every~$y$ by~$x$ in the subgoal +formula. The simultaneous unification instantiates $\Var{u}$ to~$y$; if +$y$ is not an unknown, then $\Var{P}(y)$ can easily be unified with another +formula. + +In logical parlance, the premise containing the connective to be eliminated +is called the \bfindex{major premise}. Elim-resolution expects the major +premise to come first. The order of the premises is significant in +Isabelle. + +\subsection{Destruction rules} \label{destruct} +\index{rules!destruction}\index{rules!elimination} +\index{forward proof} + +Looking back to Fig.\ts\ref{fol-fig}, notice that there are two kinds of +elimination rule. The rules $({\conj}E1)$, $({\conj}E2)$, $({\imp}E)$ and +$({\forall}E)$ extract the conclusion from the major premise. In Isabelle +parlance, such rules are called {\bf destruction rules}; they are readable +and easy to use in forward proof. The rules $({\disj}E)$, $({\bot}E)$ and +$({\exists}E)$ work by discharging assumptions; they support backward proof +in a style reminiscent of the sequent calculus. + +The latter style is the most general form of elimination rule. In natural +deduction, there is no way to recast $({\disj}E)$, $({\bot}E)$ or +$({\exists}E)$ as destruction rules. But we can write general elimination +rules for $\conj$, $\imp$ and~$\forall$: +\[ +\infer{R}{P\conj Q & \infer*{R}{[P,Q]}} \qquad +\infer{R}{P\imp Q & P & \infer*{R}{[Q]}} \qquad +\infer{Q}{\forall x.P & \infer*{Q}{[P[t/x]]}} +\] +Because they are concise, destruction rules are simpler to derive than the +corresponding elimination rules. To facilitate their use in backward +proof, Isabelle provides a means of transforming a destruction rule such as +\[ \infer[\quad\hbox{to the elimination rule}\quad]{Q}{P@1 & \ldots & P@m} + \infer{R.}{P@1 & \ldots & P@m & \infer*{R}{[Q]}} +\] +{\bf Destruct-resolution}\index{destruct-resolution} combines this +transformation with elim-resolution. It applies a destruction rule to some +assumption of a subgoal. Given the rule above, it replaces the +assumption~$P@1$ by~$Q$, with new subgoals of showing instances of $P@2$, +\ldots,~$P@m$. Destruct-resolution works forward from a subgoal's +assumptions. Ordinary resolution performs forward reasoning from theorems, +as illustrated in~\S\ref{joining}. 
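+
+To suggest how these forms of resolution look in practice, here is a
+sketch of a session at the ML top level, redoing the proof of
+\S\ref{prop-proof} with elim-resolution. The details are hypothetical:
+we assume a theory {\tt FOL} for Fig.\ts\ref{fol-fig} whose rules
+$({\imp}I)$ and $({\disj}E)$ are bound to the ML identifiers {\tt impI}
+and {\tt disjE}, and tactics {\tt resolve\_tac}, {\tt eresolve\_tac} and
+{\tt assume\_tac} that apply resolution, elim-resolution and proof by
+assumption to the numbered subgoal.
+\begin{ttbox}
+goal FOL.thy "P | P --> P";
+by (resolve_tac [impI] 1);    (*ordinary resolution with (imp I)*)
+by (eresolve_tac [disjE] 1);  (*elim-resolution deletes the assumption P|P*)
+by (assume_tac 1);            (*both remaining subgoals are  P ==> P*)
+by (assume_tac 1);
+\end{ttbox}
+Destruct-resolution is invoked analogously, through a tactic such as
+{\tt dresolve\_tac}.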
+ + +\subsection{Deriving rules by resolution} \label{deriving} +\index{rules!derived|bold}\index{meta-assumptions!syntax of} +The meta-logic, itself a form of the predicate calculus, is defined by a +system of natural deduction rules. Each theorem may depend upon +meta-assumptions. The theorem that~$\phi$ follows from the assumptions +$\phi@1$, \ldots, $\phi@n$ is written +\[ \phi \quad [\phi@1,\ldots,\phi@n]. \] +A more conventional notation might be $\phi@1,\ldots,\phi@n \turn \phi$, +but Isabelle's notation is more readable with large formulae. + +Meta-level natural deduction provides a convenient mechanism for deriving +new object-level rules. To derive the rule +\[ \infer{\phi,}{\theta@1 & \ldots & \theta@k} \] +assume the premises $\theta@1$,~\ldots,~$\theta@k$ at the +meta-level. Then prove $\phi$, possibly using these assumptions. +Starting with a proof state $\phi\Imp\phi$, assumptions may accumulate, +reaching a final state such as +\[ \phi \quad [\theta@1,\ldots,\theta@k]. \] +The meta-rule for $\Imp$ introduction discharges an assumption. +Discharging them in the order $\theta@k,\ldots,\theta@1$ yields the +meta-theorem $\List{\theta@1; \ldots; \theta@k} \Imp \phi$, with no +assumptions. This represents the desired rule. +Let us derive the general $\conj$ elimination rule: +$$ \infer{R}{P\conj Q & \infer*{R}{[P,Q]}} \eqno(\conj E) $$ +We assume $P\conj Q$ and $\List{P;Q}\Imp R$, and commence backward proof in +the state $R\Imp R$. Resolving this with the second assumption yields the +state +\[ \phantom{\List{P\conj Q;\; P\conj Q}} + \llap{$\List{P;Q}$}\Imp R \quad [\,\List{P;Q}\Imp R\,]. \] +Resolving subgoals~1 and~2 with~$({\conj}E1)$ and~$({\conj}E2)$, +respectively, yields the state +\[ \List{P\conj \Var{Q@1};\; \Var{P@2}\conj Q}\Imp R + \quad [\,\List{P;Q}\Imp R\,]. +\] +The unknowns $\Var{Q@1}$ and~$\Var{P@2}$ arise from unconstrained +subformulae in the premises of~$({\conj}E1)$ and~$({\conj}E2)$. Resolving +both subgoals with the assumption $P\conj Q$ instantiates the unknowns to yield +\[ R \quad [\, \List{P;Q}\Imp R, P\conj Q \,]. \] +The proof may use the meta-assumptions in any order, and as often as +necessary; when finished, we discharge them in the correct order to +obtain the desired form: +\[ \List{P\conj Q;\; \List{P;Q}\Imp R} \Imp R \] +We have derived the rule using free variables, which prevents their +premature instantiation during the proof; we may now replace them by +schematic variables. + +\begin{warn} + Schematic variables are not allowed in meta-assumptions, for a variety of + reasons. Meta-assumptions remain fixed throughout a proof. +\end{warn} +
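+
+Such derivations can also be performed interactively. The following ML
+sketch (hypothetical in its details, like the session above) derives the
+general $\conj$ elimination rule. It assumes that {\tt FOL} binds
+$({\conj}E1)$ and $({\conj}E2)$ to the identifiers {\tt conjunct1} and
+{\tt conjunct2}, that {\tt goal} returns the stated premises as a list of
+meta-assumptions, and that {\tt result} discharges them at the end:
+\begin{ttbox}
+val prems = goal FOL.thy "[| P & Q;  [| P; Q |] ==> R |] ==> R";
+by (resolve_tac prems 1);        (*back through  [| P; Q |] ==> R*)
+by (resolve_tac [conjunct1] 1);  (*subgoal  P  becomes  P & ?Q1*)
+by (resolve_tac prems 1);        (*solved by the premise  P & Q*)
+by (resolve_tac [conjunct2] 1);  (*subgoal  Q  becomes  ?P2 & Q*)
+by (resolve_tac prems 1);
+val conjE = result();            (*discharge the premises*)
+\end{ttbox}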