penultimate Springer draft
Fri, 15 Apr 1994 12:13:37 +0200
changeset 312 7ceea59b4748
parent 311 3fb8cdb32e10
child 313 a45ae7b38672
penultimate Springer draft
--- a/doc-src/Intro/foundations.tex	Fri Apr 15 11:48:23 1994 +0200
+++ b/doc-src/Intro/foundations.tex	Fri Apr 15 12:13:37 1994 +0200
@@ -50,29 +50,32 @@
 \section{Formalizing logical syntax in Isabelle}\label{sec:logical-syntax}
-\index{Isabelle!formalizing syntax|bold}
 Figure~\ref{fol-fig} presents intuitionistic first-order logic,
 including equality.  Let us see how to formalize
 this logic in Isabelle, illustrating the main features of Isabelle's
 polymorphic meta-logic.
-Isabelle represents syntax using the typed $\lambda$-calculus.  We declare
-a type for each syntactic category of the logic.  We declare a constant for
-each symbol of the logic, giving each $n$-place operation an $n$-argument
-curried function type.  Most importantly, $\lambda$-abstraction represents
-variable binding in quantifiers.
+Isabelle represents syntax using the simply typed $\lambda$-calculus.  We
+declare a type for each syntactic category of the logic.  We declare a
+constant for each symbol of the logic, giving each $n$-place operation an
+$n$-argument curried function type.  Most importantly,
+$\lambda$-abstraction represents variable binding in quantifiers.
+\index{lambda calc@$\lambda$-calculus} \index{lambda
+  abs@$\lambda$-abstractions}
+\index{types!syntax of}\index{types!function}\index{*fun type}
 Isabelle has \ML-style type constructors such as~$(\alpha)list$, where
 $(bool)list$ is the type of lists of booleans.  Function types have the
-form $\sigma\To\tau$, where $\sigma$ and $\tau$ are types.  Curried
-function types may be abbreviated:
+form $(\sigma,\tau)fun$ or $\sigma\To\tau$, where $\sigma$ and $\tau$ are
+types.  Curried function types may be abbreviated:
 \[  \sigma@1\To (\cdots \sigma@n\To \tau\cdots)  \quad \hbox{as} \quad
-   [\sigma@1, \ldots, \sigma@n] \To \tau $$ 
+   [\sigma@1, \ldots, \sigma@n] \To \tau \]
+\index{terms!syntax of}
 The syntax for terms is summarised below.  Note that function application is
 written $t(u)$ rather than the usual $t\,u$.
+\index{lambda abs@$\lambda$-abstractions}\index{function applications}
   t :: \tau   & \hbox{type constraint, on a term or bound variable} \\
   \lambda x.t   & \hbox{abstraction} \\
@@ -92,6 +95,7 @@
 let us declare a type~$nat$ of natural numbers.  Later, we shall see
 how to admit terms of other types.
+\index{constants}\index{*nat type}\index{*o type}
 After declaring the types~$o$ and~$nat$, we may declare constants for the
 symbols of our logic.  Since $\bot$ denotes a truth value (falsity) and 0
 denotes a number, we put \begin{eqnarray*}
@@ -119,10 +123,11 @@
 \subsection{Polymorphic types and constants} \label{polymorphic}
 Which type should we assign to the equality symbol?  If we tried
 $[nat,nat]\To o$, then equality would be restricted to the natural
-numbers; we would have to declare different equality symbols for each
+numbers; we should have to declare different equality symbols for each
 type.  Isabelle's type system is polymorphic, so we could declare
   {=}  & :: & [\alpha,\alpha]\To o,
@@ -140,11 +145,13 @@
 types.  They closely resemble the classes of the functional language
+\index{*logic class}\index{*term class}
 Isabelle provides the built-in class $logic$, which consists of the logical
 types: the ones we want to reason about.  Let us declare a class $term$, to
 consist of all legal types of terms in our logic.  The subclass structure
 is now $term\le logic$.
+\index{*nat type}
 We put $nat$ in class $term$ by declaring $nat{::}term$.  We declare the
 equality constant by
@@ -154,23 +161,24 @@
 $term$.  Such type variables resemble Standard~\ML's equality type
-We give function types and~$o$ the class $logic$ rather than~$term$, since
+We give~$o$ and function types the class $logic$ rather than~$term$, since
 they are not legal types for terms.  We may introduce new types of class
 $term$ --- for instance, type $string$ or $real$ --- at any time.  We can
 even declare type constructors such as $(\alpha)list$, and state that type
 $(\tau)list$ belongs to class~$term$ provided $\tau$ does; equality
 applies to lists of natural numbers but not to lists of formulae.  We may
 summarize this paragraph by a set of {\bf arity declarations} for type
-constructors: \index{$\To$|bold}\index{arities!declaring}
-  o     & :: & logic \\
-  {\To} & :: & (logic,logic)logic \\
+\begin{eqnarray*}\index{*o type}\index{*fun type}
+  o             & :: & logic \\
+  fun           & :: & (logic,logic)logic \\
   nat, string, real     & :: & term \\
-  list  & :: & (term)term
+  list          & :: & (term)term
+(Recall that $fun$ is the type constructor for function types.)
 In higher-order logic, equality does apply to truth values and
 functions;  this requires the arity declarations ${o::term}$
-and ${{\To}::(term,term)term}$.  The class system can also handle
+and ${fun::(term,term)term}$.  The class system can also handle
 overloading.\index{overloading|bold} We could declare $arith$ to be the
 subclass of $term$ consisting of the `arithmetic' types, such as~$nat$.
 Then we could declare the operators
@@ -212,7 +220,7 @@
 \subsection{Higher types and quantifiers}
 Quantifiers are regarded as operations upon functions.  Ignoring polymorphism
 for the moment, consider the formula $\forall x. P(x)$, where $x$ ranges
 over type~$nat$.  This is true if $P(x)$ is true for all~$x$.  Abstracting
@@ -244,12 +252,9 @@
 \section{Formalizing logical rules in Isabelle}
-\index{Isabelle!formalizing rules|bold}
 Object-logics are formalized by extending Isabelle's
 meta-logic~\cite{paulson89}, which is intuitionistic higher-order logic.
@@ -263,26 +268,27 @@
 all $x$', and expresses {\bf generality} in rules and axiom schemes. 
 \item The equality \(a\equiv b\) means `$a$ equals $b$', for expressing
-  \bfindex{definitions} (see~\S\ref{definitions}).  Equalities left over
-  from the unification process, so called \bfindex{flex-flex equations},
-  are written $a\qeq b$.  The two equality symbols have the same logical
-  meaning. 
+  {\bf definitions} (see~\S\ref{definitions}).\index{definitions}
+  Equalities left over from the unification process, so called {\bf
+    flex-flex constraints},\index{flex-flex constraints} are written $a\qeq
+  b$.  The two equality symbols have the same logical meaning.
-The syntax of the meta-logic is formalized in precisely in the same manner
-as object-logics, using the typed $\lambda$-calculus.  Analogous to
+The syntax of the meta-logic is formalized in the same manner
+as object-logics, using the simply typed $\lambda$-calculus.  Analogous to
 type~$o$ above, there is a built-in type $prop$ of meta-level truth values.
 Meta-level formulae will have this type.  Type $prop$ belongs to
 class~$logic$; also, $\sigma\To\tau$ belongs to $logic$ provided $\sigma$
 and $\tau$ do.  Here are the types of the built-in connectives:
+\begin{eqnarray*}\index{*prop type}\index{*logic class}
   \Imp     & :: & [prop,prop]\To prop \\
   \Forall  & :: & (\alpha{::}logic\To prop) \To prop \\
   {\equiv} & :: & [\alpha{::}\{\},\alpha]\To prop \\
-  \qeq & :: & [\alpha{::}\{\},\alpha]\To prop c
+  \qeq & :: & [\alpha{::}\{\},\alpha]\To prop
-The restricted polymorphism in $\Forall$ excludes certain types, those used
-just for parsing. 
+The polymorphism in $\Forall$ is restricted to class~$logic$ to exclude
+certain types, those used just for parsing.  The type variable
+$\alpha{::}\{\}$ ranges over the universal sort.
 In our formalization of first-order logic, we declared a type~$o$ of
 object-level truth values, rather than using~$prop$ for this purpose.  If
@@ -292,7 +298,7 @@
 the distinction between the meta-level and the object-level.  To formalize
 the inference rules, we shall need to relate the two levels; accordingly,
 we declare the constant
+\index{*Trueprop constant}
   Trueprop & :: & o\To prop.
@@ -302,7 +308,7 @@
 \subsection{Expressing propositional rules}
 We shall illustrate the use of the meta-logic by formalizing the rules of
 Fig.\ts\ref{fol-fig}.  Each object-level rule is expressed as a meta-level
@@ -315,10 +321,11 @@
 reading is correct because the meta-logic has simple models, where
 types denote sets and $\Forall$ really means `for all.'
+\index{*Trueprop constant}
 Isabelle adopts notational conventions to ease the writing of rules.  We may
 hide the occurrences of $Trueprop$ by making it an implicit coercion.
 Outer universal quantifiers may be dropped.  Finally, the nested implication
 \[  \phi@1\Imp(\cdots \phi@n\Imp\psi\cdots) \]
 may be abbreviated as $\List{\phi@1; \ldots; \phi@n} \Imp \psi$, which
 formalizes a rule of $n$~premises.
@@ -334,8 +341,8 @@
 $$ P \Imp P\disj Q  \qquad  Q \Imp P\disj Q  \eqno(\disj I1,2) $$
 $$ \List{P\disj Q; P\Imp R; Q\Imp R} \Imp R  \eqno(\disj E) $$
-To understand this treatment of assumptions\index{assumptions} in natural
+\noindent\index{assumptions!discharge of}
+To understand this treatment of assumptions in natural
 deduction, look at implication.  The rule $({\imp}I)$ is the classic
 example of natural deduction: to prove that $P\imp Q$ is true, assume $P$
 is true and show that $Q$ must then be true.  More concisely, if $P$
@@ -354,12 +361,13 @@
 Earlier versions of Isabelle, and certain
 papers~\cite{paulson89,paulson700}, use $\List{P}$ to mean $Trueprop(P)$.
 \subsection{Quantifier rules and substitution}
+\index{variables!bound}\index{lambda abs@$\lambda$-abstractions}
+\index{function applications}
 Isabelle expresses variable binding using $\lambda$-abstraction; for instance,
 $\forall x.P$ is formalized as $\forall(\lambda x.P)$.  Recall that $F(t)$
 is Isabelle's syntax for application of the function~$F$ to the argument~$t$;
@@ -375,8 +383,8 @@
 \ldots {\it (some set of formulae)},' where $x$ is a variable (called a {\bf
 parameter} or {\bf eigenvariable}) in some premise.  Isabelle treats
 provisos using~$\Forall$, its inbuilt notion of `for all'.
 The purpose of the proviso `$x$ not free in \ldots' is
 to ensure that the premise may not make assumptions about the value of~$x$,
 and therefore holds for all~$x$.  We formalize $(\forall I)$ by
@@ -394,7 +402,7 @@
 $$ P(t)  \Imp  \exists x.P(x)  \eqno(\exists I)$$
 $$ \List{\exists x.P(x);\; \Forall x. P(x)\Imp Q} \Imp Q  \eqno(\exists E) $$
 Let us verify $(\exists E)$ semantically.  Suppose that the premises
-hold; since $\exists x.P(x)$ is true, we may choose $a$ such that $P(a)$ is
+hold; since $\exists x.P(x)$ is true, we may choose an~$a$ such that $P(a)$ is
 true.  Instantiating $\Forall x. P(x)\Imp Q$ with $a$ yields $P(a)\Imp Q$, and
 we obtain the desired conclusion, $Q$.
@@ -402,16 +410,17 @@
 \[ \infer{P[u/t]}{t=u & P} \]
 would be hard to formalize in Isabelle.  It calls for replacing~$t$ by $u$
 throughout~$P$, which cannot be expressed using $\beta$-conversion.  Our
-rule~$(subst)$ uses the occurrences of~$x$ in~$P$ as a template for
-substitution, inferring $P[u/x]$ from~$P[t/x]$.  When we formalize this as
-an axiom, the template becomes a function variable:
+rule~$(subst)$ uses~$P$ as a template for substitution, inferring $P[u/x]$
+from~$P[t/x]$.  When we formalize this as an axiom, the template becomes a
+function variable:
 $$ \List{t=u; P(t)} \Imp P(u).  \eqno(subst)$$
 \subsection{Signatures and theories}
 A {\bf signature} contains the information necessary for type checking,
-parsing and pretty printing.  It specifies classes and their
+parsing and pretty printing a term.  It specifies classes and their
 relationships; types, with their arities, and constants, with
 their types.  It also contains syntax rules, specified using mixfix
@@ -421,6 +430,7 @@
 Under similar conditions, a signature can be extended.  Signatures are
 managed internally by Isabelle; users seldom encounter them.
 A {\bf theory} consists of a signature plus a collection of axioms.  The
 {\bf pure} theory contains only the meta-logic.  Theories can be combined
 provided their signatures are compatible.  A theory definition extends an
@@ -453,15 +463,13 @@
 coexist at the same time, and you may work in each of these during a single
+\begin{warn}\index{constants!clashes with variables}
   Confusing problems arise if you work in the wrong theory.  Each theory
   defines its own syntax.  An identifier may be regarded in one theory as a
   constant and in another as a variable.
 \section{Proof construction in Isabelle}
-\index{Isabelle!proof construction in|bold} 
 I have elsewhere described the meta-logic and demonstrated it by
 formalizing first-order logic~\cite{paulson89}.  There is a one-to-one
 correspondence between meta-level proofs and object-level proofs.  To each
@@ -487,23 +495,24 @@
 its inputs, we need not state how many clock ticks will be required.  Such
 quantities often emerge from the proof.
-Isabelle provides {\bf schematic variables}, or \bfindex{unknowns}, for
-unification.  Logically, unknowns are free variables.  But while ordinary
-variables remain fixed, unification may instantiate unknowns.  Unknowns are
-written with a ?\ prefix and are frequently subscripted: $\Var{a}$,
-$\Var{a@1}$, $\Var{a@2}$, \ldots, $\Var{P}$, $\Var{P@1}$, \ldots.
+Isabelle provides {\bf schematic variables}, or {\bf
+  unknowns},\index{unknowns} for unification.  Logically, unknowns are free
+variables.  But while ordinary variables remain fixed, unification may
+instantiate unknowns.  Unknowns are written with a ?\ prefix and are
+frequently subscripted: $\Var{a}$, $\Var{a@1}$, $\Var{a@2}$, \ldots,
+$\Var{P}$, $\Var{P@1}$, \ldots.
 Recall that an inference rule of the form
 \[ \infer{\phi}{\phi@1 & \ldots & \phi@n} \]
 is formalized in Isabelle's meta-logic as the axiom
-$\List{\phi@1; \ldots; \phi@n} \Imp \phi$.
+$\List{\phi@1; \ldots; \phi@n} \Imp \phi$.\index{resolution}
 Such axioms resemble Prolog's Horn clauses, and can be combined by
 resolution --- Isabelle's principal proof method.  Resolution yields both
 forward and backward proof.  Backward proof works by unifying a goal with
 the conclusion of a rule, whose premises become new subgoals.  Forward proof
 works by unifying theorems with the premises of a rule, deriving a new theorem.
-Isabelle axioms require an extended notion of resolution.
+Isabelle formulae require an extended notion of resolution.
 They differ from Horn clauses in two major respects:
   \item They are written in the typed $\lambda$-calculus, and therefore must be
@@ -533,9 +542,9 @@
  \Var{f}(t) \qeq g(u@1,\ldots,u@k).
 Huet's~\cite{huet75} search procedure solves equations by imitation and
-projection.  {\bf Imitation}\index{imitation|bold} makes~$\Var{f}$ apply
-leading symbol (if a constant) of the right-hand side.  To solve
-equation~(\ref{hou-eqn}), it guesses
+projection.  {\bf Imitation} makes~$\Var{f}$ apply the leading symbol (if a
+constant) of the right-hand side.  To solve equation~(\ref{hou-eqn}), it
 \[ \Var{f} \equiv \lambda x. g(\Var{h@1}(x),\ldots,\Var{h@k}(x)), \]
 where $\Var{h@1}$, \ldots, $\Var{h@k}$ are new unknowns.  Assuming there are no
 other occurrences of~$\Var{f}$, equation~(\ref{hou-eqn}) simplifies to the
@@ -544,7 +553,6 @@
 If the procedure solves these equations, instantiating $\Var{h@1}$, \ldots,
 $\Var{h@k}$, then it yields an instantiation for~$\Var{f}$.
 {\bf Projection} makes $\Var{f}$ apply one of its arguments.  To solve
 equation~(\ref{hou-eqn}), if $t$ expects~$m$ arguments and delivers a
 result of suitable type, it guesses
@@ -554,7 +562,7 @@
 \[ t(\Var{h@1}(t),\ldots,\Var{h@m}(t)) \qeq g(u@1,\ldots,u@k). $$ 
+\begin{warn}\index{unification!incompleteness of}
 Huet's unification procedure is complete.  Isabelle's polymorphic version,
 which solves for type unknowns as well as for term unknowns, is incomplete.
 The problem is that projection requires type information.  In
@@ -565,7 +573,7 @@
 to be a function type.
-\index{unknowns!of function type|bold}
 Given $\Var{f}(t@1,\ldots,t@n)\qeq u$, Huet's procedure could make up to
 $n+1$ guesses.  The search tree and set of unifiers may be infinite.  But
 higher-order unification can work effectively, provided you are careful
@@ -681,7 +689,7 @@
 \subsection{Lifting over assumptions}
-\index{lifting!over assumptions|bold}
+\index{assumptions!lifting over}
 Lifting over $\theta\Imp{}$ is the following meta-inference rule:
 \[ \infer{\List{\theta\Imp\phi@1; \ldots; \theta\Imp\phi@n} \Imp
           (\theta \Imp \phi)}
@@ -714,7 +722,7 @@
 \[ \infer{P\imp(Q\imp R).}{\infer*{R}{[P,Q]}} \]
 \subsection{Lifting over parameters}
-\index{lifting!over parameters|bold}
+\index{parameters!lifting over}
 An analogous form of lifting handles premises of the form $\Forall x\ldots\,$. 
 Here, lifting prefixes an object-rule's premises and conclusion with $\Forall
 x$.  At the same time, lifting introduces a dependence upon~$x$.  It replaces
@@ -751,8 +759,7 @@
 \section{Backward proof by resolution}
-\index{resolution!in backward proof}\index{proof!backward|bold} 
+\index{resolution!in backward proof}
 Resolution is convenient for deriving simple rules and for reasoning
 forward from facts.  It can also support backward proof, where we start
@@ -762,19 +769,19 @@
 expressing proof searches.  {\bf Tactics} refine subgoals while {\bf
   tacticals} combine tactics.
+\index{LCF system}
 Isabelle's tactics and tacticals work differently from {\sc lcf}'s.  An
 Isabelle rule is bidirectional: there is no distinction between
 inputs and outputs.  {\sc lcf} has a separate tactic for each rule;
 Isabelle performs refinement by any rule in a uniform fashion, using
-\index{subgoals|bold}\index{main goal|bold}
 Isabelle works with meta-level theorems of the form
 \( \List{\phi@1; \ldots; \phi@n} \Imp \phi \).
 We have viewed this as the {\bf rule} with premises
 $\phi@1$,~\ldots,~$\phi@n$ and conclusion~$\phi$.  It can also be viewed as
-the \bfindex{proof state} with subgoals $\phi@1$,~\ldots,~$\phi@n$ and main
+the {\bf proof state}\index{proof state}
+with subgoals $\phi@1$,~\ldots,~$\phi@n$ and main
 To prove the formula~$\phi$, take $\phi\Imp \phi$ as the initial proof
@@ -787,7 +794,6 @@
 \phi@n} \Imp \phi'$, where $\phi'$ is an instance of~$\phi$.
 \subsection{Refinement by resolution}
 To refine subgoal~$i$ of a proof state by a rule, perform the following
 \[ \infer{\hbox{new proof state}}{\hbox{rule} & & \hbox{proof state}} \]
@@ -807,7 +813,7 @@
 the proof state.
 \subsection{Proof by assumption}
-\index{proof!by assumption|bold}
+\index{assumptions!use of}
 In the course of a natural deduction proof, parameters $x@1$, \ldots,~$x@l$ and
 assumptions $\theta@1$, \ldots, $\theta@k$ accumulate, forming a context for
 each subgoal.  Repeated lifting steps can lift a rule into any context.  To
@@ -958,23 +964,25 @@
 best-first search (where a heuristic function selects the best state to
 explore) return their outcomes as a sequence.  Isabelle provides such
 procedures in the form of tacticals.  Simpler procedures can be expressed
-directly using the basic tacticals {\it THEN}, {\it ORELSE} and {\it REPEAT}:
-\item[$tac1\;THEN\;tac2$] is a tactic for sequential composition.  Applied
+directly using the basic tacticals {\tt THEN}, {\tt ORELSE} and {\tt REPEAT}:
+\item[$tac1$ THEN $tac2$] is a tactic for sequential composition.  Applied
 to a proof state, it returns all states reachable in two steps by applying
 $tac1$ followed by~$tac2$.
-\item[$tac1\;ORELSE\;tac2$] is a choice tactic.  Applied to a state, it
+\item[$tac1$ ORELSE $tac2$] is a choice tactic.  Applied to a state, it
 tries~$tac1$ and returns the result if non-empty; otherwise, it uses~$tac2$.
-\item[$REPEAT\;tac$] is a repetition tactic.  Applied to a state, it
+\item[REPEAT $tac$] is a repetition tactic.  Applied to a state, it
 returns all states reachable by applying~$tac$ as long as possible (until
 it would fail).  $REPEAT$ can be expressed in a few lines of \ML{} using
-{\it THEN} and~{\it ORELSE}.
+{\tt THEN} and~{\tt ORELSE}.
 For instance, this tactic repeatedly applies $tac1$ and~$tac2$, giving
 $tac1$ priority:
-\[ REPEAT(tac1\;ORELSE\;tac2) \]
+\begin{center} \tt
+REPEAT($tac1$ ORELSE $tac2$)
 \section{Variations on resolution}
@@ -988,7 +996,8 @@
 can also serve to derive rules.
 Consider proving the theorem $((R\disj R)\disj R)\disj R\imp R$.  By
 $({\imp}I)$, we prove $R$ from the assumption $((R\disj R)\disj R)\disj R$.
 Applying $(\disj E)$ to this assumption yields two subgoals, one that
@@ -1075,11 +1084,13 @@
 \subsection{Destruction rules} \label{destruct}
-\index{destruction rules|bold}\index{proof!forward}
+\index{forward proof}
 Looking back to Fig.\ts\ref{fol-fig}, notice that there are two kinds of
 elimination rule.  The rules $({\conj}E1)$, $({\conj}E2)$, $({\imp}E)$ and
 $({\forall}E)$ extract the conclusion from the major premise.  In Isabelle
-parlance, such rules are called \bfindex{destruction rules}; they are readable
+parlance, such rules are called {\bf destruction rules}; they are readable
 and easy to use in forward proof.  The rules $({\disj}E)$, $({\bot}E)$ and
 $({\exists}E)$ work by discharging assumptions; they support backward proof
 in a style reminiscent of the sequent calculus.
@@ -1099,7 +1110,7 @@
 \[ \infer[\quad\hbox{to the elimination rule}\quad]{Q}{P@1 & \ldots & P@m} 
    \infer{R.}{P@1 & \ldots & P@m & \infer*{R}{[Q]}} 
 {\bf Destruct-resolution} combines this transformation with
 elim-resolution.  It applies a destruction rule to some assumption of a
 subgoal.  Given the rule above, it replaces the assumption~$P@1$ by~$Q$,
@@ -1110,7 +1121,7 @@
 \subsection{Deriving rules by resolution}  \label{deriving}
+\index{rules!derived|bold}\index{meta-assumptions!syntax of}
 The meta-logic, itself a form of the predicate calculus, is defined by a
 system of natural deduction rules.  Each theorem may depend upon
 meta-assumptions.  The theorem that~$\phi$ follows from the assumptions