summary |
shortlog |
changelog |
graph |
tags |
bookmarks |
branches |
files |
changeset |
file |
latest |
revisions |
annotate |
diff |
comparison |
raw |
help

doc-src/Intro/foundations.tex

changeset 296 | e1f6cd9f682e |

parent 105 | 216d6ed87399 |

child 312 | 7ceea59b4748 |

--- a/doc-src/Intro/foundations.tex Wed Mar 23 16:56:44 1994 +0100 +++ b/doc-src/Intro/foundations.tex Thu Mar 24 13:25:12 1994 +0100 @@ -1,10 +1,13 @@ %% $Id$ \part{Foundations} -This Part presents Isabelle's logical foundations in detail: +The following sections discuss Isabelle's logical foundations in detail: representing logical syntax in the typed $\lambda$-calculus; expressing inference rules in Isabelle's meta-logic; combining rules by resolution. -Readers wishing to use Isabelle immediately may prefer to skip straight to -Part~II, using this Part (via the index) for reference only. + +If you wish to use Isabelle immediately, please turn to +page~\pageref{chap:getting}. You can always read about foundations later, +either by returning to this point or by looking up particular items in the +index. \begin{figure} \begin{eqnarray*} @@ -46,32 +49,32 @@ \caption{Intuitionistic first-order logic} \label{fol-fig} \end{figure} -\section{Formalizing logical syntax in Isabelle} +\section{Formalizing logical syntax in Isabelle}\label{sec:logical-syntax} \index{Isabelle!formalizing syntax|bold} Figure~\ref{fol-fig} presents intuitionistic first-order logic, -including equality and the natural numbers. Let us see how to formalize +including equality. Let us see how to formalize this logic in Isabelle, illustrating the main features of Isabelle's polymorphic meta-logic. Isabelle represents syntax using the typed $\lambda$-calculus. We declare a type for each syntactic category of the logic. We declare a constant for -each symbol of the logic, giving each $n$-ary operation an $n$-argument +each symbol of the logic, giving each $n$-place operation an $n$-argument curried function type. Most importantly, $\lambda$-abstraction represents variable binding in quantifiers. \index{$\To$|bold}\index{types} Isabelle has \ML-style type constructors such as~$(\alpha)list$, where $(bool)list$ is the type of lists of booleans. Function types have the -form $\sigma\To\tau$, where $\sigma$ and $\tau$ are types. Functions -taking $n$~arguments require curried function types; we may abbreviate +form $\sigma\To\tau$, where $\sigma$ and $\tau$ are types. Curried +function types may be abbreviated: \[ \sigma@1\To (\cdots \sigma@n\To \tau\cdots) \quad \hbox{as} \quad - [\sigma@1, \ldots, \sigma@n] \To \tau. $$ + [\sigma@1, \ldots, \sigma@n] \To \tau $$ The syntax for terms is summarised below. Note that function application is written $t(u)$ rather than the usual $t\,u$. \[ \begin{array}{ll} - t :: \sigma & \hbox{type constraint, on a term or variable} \\ + t :: \tau & \hbox{type constraint, on a term or bound variable} \\ \lambda x.t & \hbox{abstraction} \\ \lambda x@1\ldots x@n.t & \hbox{curried abstraction, $\lambda x@1. \ldots \lambda x@n.t$} \\ @@ -81,13 +84,13 @@ \] -\subsection{Simple types and constants} -\index{types!simple|bold} -The syntactic categories of our logic (Figure~\ref{fol-fig}) are -{\bf formulae} and {\bf terms}. Formulae denote truth -values, so (following logical tradition) we call their type~$o$. Terms can -be constructed using~0 and~$Suc$, requiring a type~$nat$ of natural -numbers. Later, we shall see how to admit terms of other types. +\subsection{Simple types and constants}\index{types!simple|bold} + +The syntactic categories of our logic (Fig.\ts\ref{fol-fig}) are {\bf + formulae} and {\bf terms}. Formulae denote truth values, so (following +tradition) let us call their type~$o$. To allow~0 and~$Suc(t)$ as terms, +let us declare a type~$nat$ of natural numbers. Later, we shall see +how to admit terms of other types. After declaring the types~$o$ and~$nat$, we may declare constants for the symbols of our logic. Since $\bot$ denotes a truth value (falsity) and 0 @@ -105,62 +108,68 @@ {\neg} & :: & o\To o \\ \conj,\disj,\imp,\bimp & :: & [o,o]\To o \end{eqnarray*} +The binary connectives can be declared as infixes, with appropriate +precedences, so that we write $P\conj Q\disj R$ instead of +$\disj(\conj(P,Q), R)$. -Isabelle allows us to declare the binary connectives as infixes, with -appropriate precedences, so that we write $P\conj Q\disj R$ instead of -$\disj(\conj(P,Q), R)$. +\S\ref{sec:defining-theories} below describes the syntax of Isabelle theory +files and illustrates it by extending our logic with mathematical induction. \subsection{Polymorphic types and constants} \label{polymorphic} \index{types!polymorphic|bold} +\index{equality!polymorphic} + Which type should we assign to the equality symbol? If we tried $[nat,nat]\To o$, then equality would be restricted to the natural numbers; we would have to declare different equality symbols for each type. Isabelle's type system is polymorphic, so we could declare \begin{eqnarray*} - {=} & :: & [\alpha,\alpha]\To o. + {=} & :: & [\alpha,\alpha]\To o, \end{eqnarray*} +where the type variable~$\alpha$ ranges over all types. But this is also wrong. The declaration is too polymorphic; $\alpha$ -ranges over all types, including~$o$ and $nat\To nat$. Thus, it admits +includes types like~$o$ and $nat\To nat$. Thus, it admits $\bot=\neg(\bot)$ and $Suc=Suc$ as formulae, which is acceptable in higher-order logic but not in first-order logic. -Isabelle's \bfindex{classes} control polymorphism. Each type variable -belongs to a class, which denotes a set of types. Classes are partially -ordered by the subclass relation, which is essentially the subset relation -on the sets of types. They closely resemble the classes of the functional -language Haskell~\cite{haskell-tutorial,haskell-report}. Nipkow and -Prehofer discuss type inference for type classes~\cite{nipkow-prehofer}. +Isabelle's {\bf type classes}\index{classes} control +polymorphism~\cite{nipkow-prehofer}. Each type variable belongs to a +class, which denotes a set of types. Classes are partially ordered by the +subclass relation, which is essentially the subset relation on the sets of +types. They closely resemble the classes of the functional language +Haskell~\cite{haskell-tutorial,haskell-report}. Isabelle provides the built-in class $logic$, which consists of the logical types: the ones we want to reason about. Let us declare a class $term$, to consist of all legal types of terms in our logic. The subclass structure is now $term\le logic$. -We declare $nat$ to be in class $term$. Type variables of class~$term$ -should resemble Standard~\ML's equality type variables, which range over -those types that possess an equality test. Thus, we declare the equality -constant by +We put $nat$ in class $term$ by declaring $nat{::}term$. We declare the +equality constant by \begin{eqnarray*} {=} & :: & [\alpha{::}term,\alpha]\To o \end{eqnarray*} +where $\alpha{::}term$ constrains the type variable~$\alpha$ to class +$term$. Such type variables resemble Standard~\ML's equality type +variables. + We give function types and~$o$ the class $logic$ rather than~$term$, since they are not legal types for terms. We may introduce new types of class $term$ --- for instance, type $string$ or $real$ --- at any time. We can even declare type constructors such as $(\alpha)list$, and state that type -$(\sigma)list$ belongs to class~$term$ provided $\sigma$ does; equality +$(\tau)list$ belongs to class~$term$ provided $\tau$ does; equality applies to lists of natural numbers but not to lists of formulae. We may summarize this paragraph by a set of {\bf arity declarations} for type -constructors: -\index{$\To$|bold}\index{arities!declaring} +constructors: \index{$\To$|bold}\index{arities!declaring} \begin{eqnarray*} o & :: & logic \\ {\To} & :: & (logic,logic)logic \\ nat, string, real & :: & term \\ list & :: & (term)term \end{eqnarray*} -Higher-order logic, where equality does apply to truth values and -functions, would require different arity declarations, namely ${o::term}$ +In higher-order logic, equality does apply to truth values and +functions; this requires the arity declarations ${o::term}$ and ${{\To}::(term,term)term}$. The class system can also handle overloading.\index{overloading|bold} We could declare $arith$ to be the subclass of $term$ consisting of the `arithmetic' types, such as~$nat$. @@ -176,33 +185,32 @@ {+},{-},{\times},{/} & :: & [complex,complex]\To complex \end{eqnarray*} Isabelle will regard these as distinct constants, each of which can be defined -separately. We could even introduce the type $(\alpha)vector$, make -$(\sigma)vector$ belong to $arith$ provided $\sigma$ is in $arith$, and define +separately. We could even introduce the type $(\alpha)vector$ and declare +its arity as $(arith)arith$. Then we could declare the constant \begin{eqnarray*} - {+} & :: & [(\sigma)vector,(\sigma)vector]\To (\sigma)vector + {+} & :: & [(\alpha)vector,(\alpha)vector]\To (\alpha)vector \end{eqnarray*} -in terms of ${+} :: [\sigma,\sigma]\To \sigma$. +and specify it in terms of ${+} :: [\alpha,\alpha]\To \alpha$. -Although we have pretended so far that a type variable belongs to only one -class --- Isabelle's concrete syntax helps to uphold this illusion --- it -may in fact belong to any finite number of classes. For example suppose -that we had declared yet another class $ord \le term$, the class of all +A type variable may belong to any finite number of classes. Suppose that +we had declared yet another class $ord \le term$, the class of all `ordered' types, and a constant \begin{eqnarray*} {\le} & :: & [\alpha{::}ord,\alpha]\To o. \end{eqnarray*} In this context the variable $x$ in $x \le (x+x)$ will be assigned type -$\alpha{::}\{arith,ord\}$, i.e.\ $\alpha$ belongs to both $arith$ and $ord$. -Semantically the set $\{arith,ord\}$ should be understood -as the intersection of the sets of types represented by $arith$ and $ord$. -Such intersections of classes are called \bfindex{sorts}. The empty -intersection of classes, $\{\}$, contains all types and is thus the -{\bf universal sort}. +$\alpha{::}\{arith,ord\}$, which means $\alpha$ belongs to both $arith$ and +$ord$. Semantically the set $\{arith,ord\}$ should be understood as the +intersection of the sets of types represented by $arith$ and $ord$. Such +intersections of classes are called \bfindex{sorts}. The empty +intersection of classes, $\{\}$, contains all types and is thus the {\bf + universal sort}. -The type checker handles overloading, assigning each term a unique type. For -this to be possible, the class and type declarations must satisfy certain +Even with overloading, each term has a unique, most general type. For this +to be possible, the class and type declarations must satisfy certain technical constraints~\cite{nipkow-prehofer}. + \subsection{Higher types and quantifiers} \index{types!higher|bold} Quantifiers are regarded as operations upon functions. Ignoring polymorphism @@ -242,9 +250,11 @@ \index{implication!meta-level|bold} \index{quantifiers!meta-level|bold} \index{equality!meta-level|bold} -Object-logics are formalized by extending Isabelle's meta-logic, which is -intuitionistic higher-order logic. The meta-level connectives are {\bf -implication}, the {\bf universal quantifier}, and {\bf equality}. + +Object-logics are formalized by extending Isabelle's +meta-logic~\cite{paulson89}, which is intuitionistic higher-order logic. +The meta-level connectives are {\bf implication}, the {\bf universal + quantifier}, and {\bf equality}. \begin{itemize} \item The implication \(\phi\Imp \psi\) means `\(\phi\) implies \(\psi\)', and expresses logical {\bf entailment}. @@ -294,7 +304,7 @@ \subsection{Expressing propositional rules} \index{rules!propositional|bold} We shall illustrate the use of the meta-logic by formalizing the rules of -Figure~\ref{fol-fig}. Each object-level rule is expressed as a meta-level +Fig.\ts\ref{fol-fig}. Each object-level rule is expressed as a meta-level axiom. One of the simplest rules is $(\conj E1)$. Making @@ -375,7 +385,7 @@ The $\forall E$ rule exploits $\beta$-conversion. Hiding $Trueprop$, the $\forall$ axioms are $$ \left(\Forall x. P(x)\right) \Imp \forall x.P(x) \eqno(\forall I) $$ -$$ \forall x.P(x) \Imp P(t). \eqno(\forall E)$$ +$$ (\forall x.P(x)) \Imp P(t). \eqno(\forall E)$$ \noindent We have defined the object-level universal quantifier~($\forall$) @@ -420,10 +430,11 @@ such as lists and their operations, or an entire logic. A mathematical development typically involves many theories in a hierarchy. For example, the pure theory could be extended to form a theory for -Figure~\ref{fol-fig}; this could be extended in two separate ways to form a +Fig.\ts\ref{fol-fig}; this could be extended in two separate ways to form a theory for natural numbers and a theory for lists; the union of these two could be extended into a theory defining the length of a list: -\[ \bf +\begin{tt} +\[ \begin{array}{c@{}c@{}c@{}c@{}c} {} & {} & \hbox{Length} & {} & {} \\ {} & {} & \uparrow & {} & {} \\ @@ -436,27 +447,35 @@ {} & {} &\hbox{Pure}& {} & {} \end{array} \] +\end{tt} Each Isabelle proof typically works within a single theory, which is associated with the proof state. However, many different theories may coexist at the same time, and you may work in each of these during a single session. +\begin{warn} + Confusing problems arise if you work in the wrong theory. Each theory + defines its own syntax. An identifier may be regarded in one theory as a + constant and in another as a variable. +\end{warn} \section{Proof construction in Isabelle} -\index{Isabelle!proof construction in|bold} -There is a one-to-one correspondence between meta-level proofs and -object-level proofs~\cite{paulson89}. To each use of a meta-level axiom, -such as $(\forall I)$, there is a use of the corresponding object-level -rule. Object-level assumptions and parameters have meta-level -counterparts. The meta-level formalization is {\bf faithful}, admitting no -incorrect object-level inferences, and {\bf adequate}, admitting all -correct object-level inferences. These properties must be demonstrated -separately for each object-logic. +\index{Isabelle!proof construction in|bold} + +I have elsewhere described the meta-logic and demonstrated it by +formalizing first-order logic~\cite{paulson89}. There is a one-to-one +correspondence between meta-level proofs and object-level proofs. To each +use of a meta-level axiom, such as $(\forall I)$, there is a use of the +corresponding object-level rule. Object-level assumptions and parameters +have meta-level counterparts. The meta-level formalization is {\bf + faithful}, admitting no incorrect object-level inferences, and {\bf + adequate}, admitting all correct object-level inferences. These +properties must be demonstrated separately for each object-logic. The meta-logic is defined by a collection of inference rules, including equational rules for the $\lambda$-calculus, and logical rules. The rules for~$\Imp$ and~$\Forall$ resemble those for~$\imp$ and~$\forall$ in -Figure~\ref{fol-fig}. Proofs performed using the primitive meta-rules +Fig.\ts\ref{fol-fig}. Proofs performed using the primitive meta-rules would be lengthy; Isabelle proofs normally use certain derived rules. {\bf Resolution}, in particular, is convenient for backward proof. @@ -468,35 +487,33 @@ its inputs, we need not state how many clock ticks will be required. Such quantities often emerge from the proof. -\index{variables!schematic|see{unknowns}} Isabelle provides {\bf schematic variables}, or \bfindex{unknowns}, for -unification. Logically, unknowns are free variables. Pragmatically, they -may be instantiated during a proof, while ordinary variables remain fixed. -Unknowns are written with a ?\ prefix and are frequently subscripted: -$\Var{a}$, $\Var{a@1}$, $\Var{a@2}$, \ldots, $\Var{P}$, $\Var{P@1}$, \ldots. +unification. Logically, unknowns are free variables. But while ordinary +variables remain fixed, unification may instantiate unknowns. Unknowns are +written with a ?\ prefix and are frequently subscripted: $\Var{a}$, +$\Var{a@1}$, $\Var{a@2}$, \ldots, $\Var{P}$, $\Var{P@1}$, \ldots. Recall that an inference rule of the form \[ \infer{\phi}{\phi@1 & \ldots & \phi@n} \] is formalized in Isabelle's meta-logic as the axiom $\List{\phi@1; \ldots; \phi@n} \Imp \phi$. -Such axioms resemble {\sc Prolog}'s Horn clauses, and can be combined by +Such axioms resemble Prolog's Horn clauses, and can be combined by resolution --- Isabelle's principal proof method. Resolution yields both forward and backward proof. Backward proof works by unifying a goal with the conclusion of a rule, whose premises become new subgoals. Forward proof works by unifying theorems with the premises of a rule, deriving a new theorem. -Isabelle axioms will require an extended notion of resolution, because -they differ from Horn clauses in two major respects: +Isabelle axioms require an extended notion of resolution. +They differ from Horn clauses in two major respects: \begin{itemize} \item They are written in the typed $\lambda$-calculus, and therefore must be resolved using higher-order unification. - \item Horn clauses are composed of atomic formulae. Any formula of the form -$Trueprop(\cdots)$ is atomic, but axioms such as ${\imp}I$ and $\forall I$ -contain non-atomic formulae. -\index{Trueprop@{$Trueprop$}} +\item The constituents of a clause need not be atomic formulae. Any + formula of the form $Trueprop(\cdots)$ is atomic, but axioms such as + ${\imp}I$ and $\forall I$ contain non-atomic formulae. \end{itemize} -Isabelle should not be confused with classical resolution theorem provers +Isabelle has little in common with classical resolution theorem provers such as Otter~\cite{wos-bledsoe}. At the meta-level, Isabelle proves theorems in their positive form, not by refutation. However, an object-logic that includes a contradiction rule may employ a refutation @@ -649,7 +666,7 @@ an equation like $Suc(Suc(Suc(m)))=Suc(Suc(Suc(0)))$. -\subsection{Lifting a rule into a context} +\section{Lifting a rule into a context} The rules $({\imp}I)$ and $(\forall I)$ may seem unsuitable for resolution. They have non-atomic premises, namely $P\Imp Q$ and $\Forall x.P(x)$, while the conclusions of all the rules are atomic (they have the form @@ -657,20 +674,20 @@ called \bfindex{lifting}. Let us consider how to construct proofs such as \[ \infer[({\imp}I)]{P\imp(Q\imp R)} {\infer[({\imp}I)]{Q\imp R} - {\infer*{R}{[P] & [Q]}}} + {\infer*{R}{[P,Q]}}} \qquad \infer[(\forall I)]{\forall x\,y.P(x,y)} {\infer[(\forall I)]{\forall y.P(x,y)}{P(x,y)}} \] -\subsubsection{Lifting over assumptions} +\subsection{Lifting over assumptions} \index{lifting!over assumptions|bold} Lifting over $\theta\Imp{}$ is the following meta-inference rule: \[ \infer{\List{\theta\Imp\phi@1; \ldots; \theta\Imp\phi@n} \Imp (\theta \Imp \phi)} {\List{\phi@1; \ldots; \phi@n} \Imp \phi} \] This is clearly sound: if $\List{\phi@1; \ldots; \phi@n} \Imp \phi$ is true and -$\theta\Imp\phi@1$, \ldots, $\theta\Imp\phi@n$, $\theta$ are all true then +$\theta\Imp\phi@1$, \ldots, $\theta\Imp\phi@n$ and $\theta$ are all true then $\phi$ must be true. Iterated lifting over a series of meta-formulae $\theta@k$, \ldots, $\theta@1$ yields an object-rule whose conclusion is $\List{\theta@1; \ldots; \theta@k} \Imp \phi$. Typically the $\theta@i$ are @@ -682,7 +699,10 @@ \[ (\Var{P} \Imp \Var{Q}) \Imp \Var{P}\imp \Var{Q}. \] To resolve this rule with itself, Isabelle modifies one copy as follows: it renames the unknowns to $\Var{P@1}$ and $\Var{Q@1}$, then lifts the rule over -$\Var{P}\Imp{}$: +$\Var{P}\Imp{}$ to obtain +\[ (\Var{P}\Imp (\Var{P@1} \Imp \Var{Q@1})) \Imp (\Var{P} \Imp + (\Var{P@1}\imp \Var{Q@1})). \] +Using the $\List{\cdots}$ abbreviation, this can be written as \[ \List{\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}; \Var{P}} \Imp \Var{P@1}\imp \Var{Q@1}. \] Unifying $\Var{P}\Imp \Var{P@1}\imp\Var{Q@1}$ with $\Var{P} \Imp @@ -693,7 +713,7 @@ This represents the derived rule \[ \infer{P\imp(Q\imp R).}{\infer*{R}{[P,Q]}} \] -\subsubsection{Lifting over parameters} +\subsection{Lifting over parameters} \index{lifting!over parameters|bold} An analogous form of lifting handles premises of the form $\Forall x\ldots\,$. Here, lifting prefixes an object-rule's premises and conclusion with $\Forall @@ -704,8 +724,13 @@ \[ \infer{\List{\Forall x.\phi@1^x; \ldots; \Forall x.\phi@n^x} \Imp \Forall x.\phi^x,} {\List{\phi@1; \ldots; \phi@n} \Imp \phi} \] -where $\phi^x$ stands for the result of lifting unknowns over~$x$ in $\phi$. -It is not hard to verify that this meta-inference is sound. +% +where $\phi^x$ stands for the result of lifting unknowns over~$x$ in +$\phi$. It is not hard to verify that this meta-inference is sound. If +$\phi\Imp\psi$ then $\phi^x\Imp\psi^x$ for all~$x$; so if $\phi^x$ is true +for all~$x$ then so is $\psi^x$. Thus, from $\phi\Imp\psi$ we conclude +$(\Forall x.\phi^x) \Imp (\Forall x.\psi^x)$. + For example, $(\disj I)$ might be lifted to \[ (\Forall x.\Var{P@1}(x)) \Imp (\Forall x. \Var{P@1}(x)\disj \Var{Q@1}(x))\] and $(\forall I)$ to @@ -722,22 +747,24 @@ \quad\hbox{provided $x$, $y$ not free in the assumptions} \] I discuss lifting and parameters at length elsewhere~\cite{paulson89}. -Miller goes into even greater detail~\cite{miller-jsc}. +Miller goes into even greater detail~\cite{miller-mixed}. \section{Backward proof by resolution} -\index{resolution!in backward proof}\index{proof!backward|bold} +\index{resolution!in backward proof}\index{proof!backward|bold} +\index{tactics}\index{tacticals} + Resolution is convenient for deriving simple rules and for reasoning forward from facts. It can also support backward proof, where we start with a goal and refine it to progressively simpler subgoals until all have -been solved. {\sc lcf} (and its descendants {\sc hol} and Nuprl) provide +been solved. {\sc lcf} and its descendants {\sc hol} and Nuprl provide tactics and tacticals, which constitute a high-level language for -expressing proof searches. \bfindex{Tactics} perform primitive refinements -while \bfindex{tacticals} combine tactics. +expressing proof searches. {\bf Tactics} refine subgoals while {\bf + tacticals} combine tactics. \index{LCF} Isabelle's tactics and tacticals work differently from {\sc lcf}'s. An -Isabelle rule is {\bf bidirectional}: there is no distinction between +Isabelle rule is bidirectional: there is no distinction between inputs and outputs. {\sc lcf} has a separate tactic for each rule; Isabelle performs refinement by any rule in a uniform fashion, using resolution. @@ -753,8 +780,8 @@ To prove the formula~$\phi$, take $\phi\Imp \phi$ as the initial proof state. This assertion is, trivially, a theorem. At a later stage in the backward proof, a typical proof state is $\List{\phi@1; \ldots; \phi@n} -\Imp \phi$. This proof state is a theorem, guaranteeing that the subgoals -$\phi@1$,~\ldots,~$\phi@n$ imply~$\phi$. If $m=0$ then we have +\Imp \phi$. This proof state is a theorem, ensuring that the subgoals +$\phi@1$,~\ldots,~$\phi@n$ imply~$\phi$. If $n=0$ then we have proved~$\phi$ outright. If $\phi$ contains unknowns, they may become instantiated during the proof; a proof state may be $\List{\phi@1; \ldots; \phi@n} \Imp \phi'$, where $\phi'$ is an instance of~$\phi$. @@ -806,32 +833,38 @@ regards them as unique constants with a limited scope --- this enforces parameter provisos~\cite{paulson89}. -The premise represents a proof state with~$n$ subgoals, of which the~$i$th is -to be solved by assumption. Isabelle searches the subgoal's context for an -assumption, say $\theta@j$, that can solve it by unification. For each -unifier, the meta-inference returns an instantiated proof state from which the -$i$th subgoal has been removed. Isabelle searches for a unifying assumption; -for readability and robustness, proofs do not refer to assumptions by number. +The premise represents a proof state with~$n$ subgoals, of which the~$i$th +is to be solved by assumption. Isabelle searches the subgoal's context for +an assumption~$\theta@j$ that can solve it. For each unifier, the +meta-inference returns an instantiated proof state from which the $i$th +subgoal has been removed. Isabelle searches for a unifying assumption; for +readability and robustness, proofs do not refer to assumptions by number. -Consider the proof state $(\List{P(a); P(b)} \Imp P(\Var{x})) \Imp Q(\Var{x})$. +Consider the proof state +\[ (\List{P(a); P(b)} \Imp P(\Var{x})) \Imp Q(\Var{x}). \] Proof by assumption (with $i=1$, the only possibility) yields two results: \begin{itemize} \item $Q(a)$, instantiating $\Var{x}\equiv a$ \item $Q(b)$, instantiating $\Var{x}\equiv b$ \end{itemize} Here, proof by assumption affects the main goal. It could also affect -other subgoals. Consider the effect of having the -additional subgoal ${\List{P(b); P(c)} \Imp P(\Var{x})}$. +other subgoals; if we also had the subgoal ${\List{P(b); P(c)} \Imp + P(\Var{x})}$, then $\Var{x}\equiv a$ would transform it to ${\List{P(b); + P(c)} \Imp P(a)}$, which might be unprovable. + \subsection{A propositional proof} \label{prop-proof} \index{examples!propositional} Our first example avoids quantifiers. Given the main goal $P\disj P\imp P$, Isabelle creates the initial state -\[ (P\disj P\imp P)\Imp (P\disj P\imp P). \] +\[ (P\disj P\imp P)\Imp (P\disj P\imp P). \] +% Bear in mind that every proof state we derive will be a meta-theorem, -expressing that the subgoals imply the main goal. The first step is to refine -subgoal~1 by (${\imp}I)$, creating a new state where $P\disj P$ is an -assumption: +expressing that the subgoals imply the main goal. Our aim is to reach the +state $P\disj P\imp P$; this meta-theorem is the desired result. + +The first step is to refine subgoal~1 by (${\imp}I)$, creating a new state +where $P\disj P$ is an assumption: \[ (P\disj P\Imp P)\Imp (P\disj P\imp P) \] The next step is $(\disj E)$, which replaces subgoal~1 by three new subgoals. Because of lifting, each subgoal contains a copy of the context --- the @@ -855,8 +888,8 @@ \rbrakk\;& \Imp (P\disj P\imp P) & \hbox{(main goal)} \end{array} \] Both of the remaining subgoals can be proved by assumption. After two such -steps, the proof state is simply the meta-theorem $P\disj P\imp P$. This is -our desired result. +steps, the proof state is $P\disj P\imp P$. + \subsection{A quantifier proof} \index{examples!with quantifiers} @@ -888,8 +921,8 @@ \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) \] The next step is refinement by $(\exists I)$. The rule is lifted into the -context of the parameter~$x$ and the assumption $P(f(x))$. This ensures that -the context is copied to the subgoal and allows the existential witness to +context of the parameter~$x$ and the assumption $P(f(x))$. This copies +the context to the subgoal and allows the existential witness to depend upon~$x$: \[ \left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp P(\Var{x@2}(x))\right) @@ -907,7 +940,7 @@ of {\sc lcf}, {\sc hol} and Nuprl by operating on entire proof states, rather than on individual subgoals. An Isabelle tactic is a function that takes a proof state and returns a sequence (lazy list) of possible -successor states. Sequences are coded in ML as functions, a standard +successor states. Lazy lists are coded in ML as functions, a standard technique~\cite{paulson91}. Isabelle represents proof states by theorems. Basic tactics execute the meta-rules described above, operating on a @@ -984,21 +1017,27 @@ Elim-resolution is Isabelle's way of getting sequent calculus behaviour from natural deduction rules. It lets an elimination rule consume an -assumption. Elim-resolution takes a rule $\List{\psi@1; \ldots; \psi@m} -\Imp \psi$ a proof state $\List{\phi@1; \ldots; \phi@n} \Imp \phi$, and a -subgoal number~$i$. The rule must have at least one premise, thus $m>0$. -Write the rule's lifted form as $\List{\psi'@1; \ldots; \psi'@m} \Imp -\psi'$. Ordinary resolution would attempt to reduce~$\phi@i$, -replacing subgoal~$i$ by $m$ new ones. Elim-resolution tries {\bf -simultaneously} to reduce~$\phi@i$ and to solve~$\psi'@1$ by assumption; it +assumption. Elim-resolution combines two meta-theorems: +\begin{itemize} + \item a rule $\List{\psi@1; \ldots; \psi@m} \Imp \psi$ + \item a proof state $\List{\phi@1; \ldots; \phi@n} \Imp \phi$ +\end{itemize} +The rule must have at least one premise, thus $m>0$. Write the rule's +lifted form as $\List{\psi'@1; \ldots; \psi'@m} \Imp \psi'$. Suppose we +wish to change subgoal number~$i$. + +Ordinary resolution would attempt to reduce~$\phi@i$, +replacing subgoal~$i$ by $m$ new ones. Elim-resolution tries +simultaneously to reduce~$\phi@i$ and to solve~$\psi'@1$ by assumption; it returns a sequence of next states. Each of these replaces subgoal~$i$ by instances of $\psi'@2$, \ldots, $\psi'@m$ from which the selected assumption has been deleted. Suppose $\phi@i$ has the parameter~$x$ and assumptions $\theta@1$,~\ldots,~$\theta@k$. Then $\psi'@1$, the rule's first premise after lifting, will be \( \Forall x. \List{\theta@1; \ldots; \theta@k}\Imp \psi^{x}@1 \). -Elim-resolution tries to unify simultaneously $\psi'\qeq\phi@i$ and -$\lambda x. \theta@j \qeq \lambda x. \psi^{x}@1$, for $j=1$,~\ldots,~$k$. +Elim-resolution tries to unify $\psi'\qeq\phi@i$ and +$\lambda x. \theta@j \qeq \lambda x. \psi^{x}@1$ simultaneously, for +$j=1$,~\ldots,~$k$. Let us redo the example from~\S\ref{prop-proof}. The elimination rule is~$(\disj E)$, @@ -1037,7 +1076,7 @@ \subsection{Destruction rules} \label{destruct} \index{destruction rules|bold}\index{proof!forward} -Looking back to Figure~\ref{fol-fig}, notice that there are two kinds of +Looking back to Fig.\ts\ref{fol-fig}, notice that there are two kinds of elimination rule. The rules $({\conj}E1)$, $({\conj}E2)$, $({\imp}E)$ and $({\forall}E)$ extract the conclusion from the major premise. In Isabelle parlance, such rules are called \bfindex{destruction rules}; they are readable @@ -1113,11 +1152,7 @@ schematic variables. \begin{warn} -Schematic variables are not allowed in (meta) assumptions because they would -complicate lifting. Assumptions remain fixed throughout a proof. +Schematic variables are not allowed in meta-assumptions because they would +complicate lifting. Meta-assumptions remain fixed throughout a proof. \end{warn} -% Local Variables: -% mode: latex -% TeX-master: t -% End: