author lcp Thu, 24 Mar 1994 13:25:12 +0100 changeset 296 e1f6cd9f682e parent 295 dcde5024895d child 297 5ef75ff3baeb
revisions to first Springer draft
 doc-src/Intro/advanced.tex file | annotate | diff | comparison | revisions doc-src/Intro/foundations.tex file | annotate | diff | comparison | revisions doc-src/Intro/getting.tex file | annotate | diff | comparison | revisions doc-src/Intro/intro.tex file | annotate | diff | comparison | revisions
--- a/doc-src/Intro/advanced.tex	Wed Mar 23 16:56:44 1994 +0100
+++ b/doc-src/Intro/advanced.tex	Thu Mar 24 13:25:12 1994 +0100
@@ -21,8 +21,6 @@
We have covered only the bare essentials of Isabelle, but enough to perform
substantial proofs.  By occasionally dipping into the {\em Reference
Manual}, you can learn additional tactics, subgoal commands and tacticals.
-Isabelle's simplifier and classical theorem prover are
-difficult to learn, and can be ignored at first.

\section{Deriving rules in Isabelle}
@@ -34,13 +32,15 @@
definitions.

-\subsection{Deriving a rule using tactics} \label{deriving-example}
+\subsection{Deriving a rule using tactics and meta-level assumptions}
+\label{deriving-example}
\index{examples!of deriving rules}
+
The subgoal module supports the derivation of rules.  The \ttindex{goal}
command, when supplied a goal of the form $\List{\theta@1; \ldots; \theta@k} \Imp \phi$, creates $\phi\Imp\phi$ as the initial proof state and
returns a list consisting of the theorems
-${\theta@i\;[\theta@i]}$, for $i=1$, \ldots,~$k$.  These assumptions are
+${\theta@i\;[\theta@i]}$, for $i=1$, \ldots,~$k$.  These meta-assumptions are
also recorded internally, allowing \ttindex{result} to discharge them in the
original order.

@@ -124,14 +124,6 @@
{\bf Folding} a definition replaces occurrences of the right-hand side by
the left.  The occurrences need not be free in the entire formula.

-\begin{warn}
-Isabelle does not distinguish sensible definitions, like $1\equiv Suc(0)$, from
-equations like $1\equiv Suc(1)$.  However, meta-rewriting fails for
-equations like ${f(\Var{x})\equiv g(\Var{x},\Var{y})}$: all variables on
-the right-hand side must also be present on the left.
-\index{rewriting!meta-level}
-\end{warn}
-
When you define new concepts, you should derive rules asserting their
abstract properties, and then forget their definitions.  This supports
modularity: if you later change the definitions, without affecting their
@@ -149,16 +141,16 @@
peculiar.  Using Isabelle, we shall derive pleasanter negation rules:
$\infer[({\neg}I)]{\neg P}{\infer*{\bot}{[P]}} \qquad \infer[({\neg}E)]{Q}{\neg P & P}$
-This requires proving the following formulae:
+This requires proving the following meta-formulae:
$$(P\Imp\bot) \Imp \neg P \eqno(\neg I)$$
$$\List{\neg P; P} \Imp Q. \eqno(\neg E)$$

-\subsubsection{Deriving the introduction rule}
+\subsection{Deriving the $\neg$ introduction rule}
To derive $(\neg I)$, we may call \ttindex{goal} with the appropriate
formula.  Again, {\tt goal} returns a list consisting of the rule's
-premises.  We bind this list, which contains the one element $P\Imp\bot$,
-to the \ML\ identifier {\tt prems}.
+premises.  We bind this one-element list to the \ML\ identifier {\tt
+  prems}.
\begin{ttbox}
val prems = goal FOL.thy "(P ==> False) ==> ~P";
{\out Level 0}
@@ -189,21 +181,22 @@
{\out ~P}
{\out  1. P ==> P}
\end{ttbox}
-The rest of the proof is routine.
+The rest of the proof is routine.  Note the form of the final result.
\begin{ttbox}
by (assume_tac 1);
{\out Level 4}
{\out ~P}
{\out No subgoals!}
+\ttbreak
val notI = result();
{\out val notI = "(?P ==> False) ==> ~?P" : thm}
\end{ttbox}
\indexbold{*notI}

-\medskip
There is a simpler way of conducting this proof.  The \ttindex{goalw}
command starts a backward proof, as does \ttindex{goal}, but it also
-unfolds definitions:
+unfolds definitions.  Thus there is no need to call
+\ttindex{rewrite_goals_tac}:
\begin{ttbox}
val prems = goalw FOL.thy [not_def]
"(P ==> False) ==> ~P";
@@ -212,17 +205,14 @@
{\out  1. P --> False}
{\out val prems = ["P ==> False  [P ==> False]"] : thm list}
\end{ttbox}
-The proof continues as above, but without calling \ttindex{rewrite_goals_tac}.

-\subsubsection{Deriving the elimination rule}
+\subsection{Deriving the $\neg$ elimination rule}
Let us derive the rule $(\neg E)$.  The proof follows that of~{\tt conjE}
-(\S\ref{deriving-example}), with an additional step to unfold negation in
-the major premise.  Although the {\tt goalw} command is best for this, let
-us try~\ttindex{goal} and examine another way of unfolding definitions.
-
-As usual, we bind the premises to \ML\ identifiers.  We then apply
-\ttindex{FalseE}, which stands for~$(\bot E)$:
+above, with an additional step to unfold negation in the major premise.
+Although the {\tt goalw} command is best for this, let us
+try~\ttindex{goal} to see another way of unfolding definitions.  After
+binding the premises to \ML\ identifiers, we apply \ttindex{FalseE}:
\begin{ttbox}
val [major,minor] = goal FOL.thy "[| ~P;  P |] ==> R";
{\out Level 0}
@@ -247,7 +237,7 @@
\end{ttbox}
For subgoal~1, we transform the major premise from~$\neg P$
to~${P\imp\bot}$.  The function \ttindex{rewrite_rule}, given a list of
-definitions, unfolds them in a theorem.  Rewriting does {\bf not}
+definitions, unfolds them in a theorem.  Rewriting does not
affect the theorem's hypothesis, which remains~$\neg P$:
\begin{ttbox}
rewrite_rule [not_def] major;
@@ -257,7 +247,7 @@
{\out R}
{\out  1. P}
\end{ttbox}
-The subgoal {\tt?P1} has been instantiate to~{\tt P}, which we can prove
+The subgoal {\tt?P1} has been instantiated to~{\tt P}, which we can prove
using the minor premise:
\begin{ttbox}
by (resolve_tac [minor] 1);
@@ -279,9 +269,9 @@
{\out val major = "P --> False  [~ P]" : thm}
{\out val minor = "P  [P]" : thm}
\end{ttbox}
-Observe the difference in {\tt major}; the premises are now {\bf unfolded}
-and we need not call~\ttindex{rewrite_rule}.  Incidentally, the four calls
-to \ttindex{resolve_tac} above can be collapsed to one, with the help
+Observe the difference in {\tt major}; the premises are unfolded without
+calling~\ttindex{rewrite_rule}.  Incidentally, the four calls to
+\ttindex{resolve_tac} above can be collapsed to one, with the help
of~\ttindex{RS}; this is a typical example of forward reasoning from a
complex premise.
\begin{ttbox}
@@ -387,20 +377,17 @@
all goes well, {\tt use_thy} will finally read the file {\it t}{\tt.ML}, if
it exists.  This file typically begins with the {\ML} declaration {\tt
open}~$T$ and contains proofs that refer to the components of~$T$.
-Theories can be defined directly by issuing {\ML} declarations to Isabelle,
-but the calling sequences are extremely cumbersome.

-If theory~$T$ is later redeclared in order to delete an incorrect rule,
-bindings to the old rule may persist.  Isabelle ensures that the old and
-new versions of~$T$ are not involved in the same proof.  Attempting to
-combine different versions of~$T$ yields the fatal error
-\begin{ttbox}
-Attempt to merge different versions of theory: $$T$$
-\end{ttbox}
+When a theory file is modified, many theories may have to be reloaded.
+Isabelle records the modification times and dependencies of theory files.
+See the {\em Reference Manual\/}
+for more details.
+

\subsection{Declaring constants and rules}
\indexbold{constants!declaring}\indexbold{rules!declaring}
-Most theories simply declare constants and some rules.  The {\bf constant
+Most theories simply declare constants and rules.  The {\bf constant
declaration part} has the form
\begin{ttbox}
consts  $$c@1$$ :: "$$\tau@1$$"
@@ -476,7 +463,7 @@
\end{ttbox}
where $tycon@1$, \ldots, $tycon@n$ are identifiers and $arity@1$, \ldots,
$arity@n$ are arities.  Arity declarations add arities to existing
-types; they complement type declarations.
+types; they do not declare the types themselves.
In the simplest case, for an 0-place type constructor, an arity is simply
the type's class.  Let us declare a type~$bool$ of class $term$, with
constants $tt$ and~$ff$.  (In first-order logic, booleans are
@@ -489,23 +476,26 @@
consts  tt,ff   :: "bool"
end
\end{ttbox}
-Type constructors can take arguments.  Each type constructor has an {\bf
-  arity} with respect to classes~(\S\ref{polymorphic}).  A $k$-place type
-constructor may have arities of the form $(s@1,\ldots,s@k)c$, where
-$s@1,\ldots,s@n$ are sorts and $c$ is a class.  Each sort specifies a type
-argument; it has the form $\{c@1,\ldots,c@m\}$, where $c@1$, \dots,~$c@m$
-are classes.  Mostly we deal with singleton sorts, and may abbreviate them
-by dropping the braces.  The arity $(term)term$ is short for
-$(\{term\})term$.
+A $k$-place type constructor may have arities of the form
+$(s@1,\ldots,s@k)c$, where $s@1,\ldots,s@n$ are sorts and $c$ is a class.
+Each sort specifies a type argument; it has the form $\{c@1,\ldots,c@m\}$,
+where $c@1$, \dots,~$c@m$ are classes.  Mostly we deal with singleton
+sorts, and may abbreviate them by dropping the braces.  The arity
+$(term)term$ is short for $(\{term\})term$.  Recall the discussion in
+\S\ref{polymorphic}.

A type constructor may be overloaded (subject to certain conditions) by
-appearing in several arity declarations.  For instance, the built-in type
+appearing in several arity declarations.  For instance, the function type
constructor~$\To$ has the arity $(logic,logic)logic$; in higher-order
logic, it is declared also to have arity $(term,term)term$.

Theory {\tt List} declares the 1-place type constructor $list$, gives
it arity $(term)term$, and declares constants $Nil$ and $Cons$ with
-polymorphic types:
+polymorphic types:%
+\footnote{In the {\tt consts} part, type variable {\tt'a} has the default
+  sort, which is {\tt term}.  See the {\em Reference Manual\/}
+\iflabelundefined{sec:ref-defining-theories}{}%
\index{examples!of theories}
\begin{ttbox}
List = FOL +
@@ -553,7 +543,7 @@
If :: "[o,o,o] => o"       ("if _ then _ else _")
\end{ttbox}
declares a constant $If$ of type $[o,o,o] \To o$ with concrete syntax {\tt
-  if~$P$ then~$Q$ else~$R$} instead of {\tt If($P$,$Q$,$R$)}.  Underscores
+  if~$P$ then~$Q$ else~$R$} as well as {\tt If($P$,$Q$,$R$)}.  Underscores
denote argument positions.  Pretty-printing information can be specified in
order to improve the layout of formulae with mixfix operations.  For
details, see {\em Isabelle's Object-Logics}.
@@ -580,7 +570,6 @@
\begin{quote}\tt
if (if $P$ then $Q$ else $R$) then $S$ else $T$
\end{quote}
-Conditional expressions can also be written using the constant {\tt If}.

Binary type constructors, like products and sums, may also be declared as
infixes.  The type declaration below introduces a type constructor~$*$ with
@@ -621,11 +610,11 @@
\begin{ttbox}
$$id$$ < $$c@1$$, \ldots, $$c@k$$
\end{ttbox}
-Type classes allow constants to be overloaded~(\S\ref{polymorphic}).  As an
-example, we define the class $arith$ of arithmetic'' types with the
-constants ${+} :: [\alpha,\alpha]\To \alpha$ and $0,1 :: \alpha$, for
-$\alpha{::}arith$.  We introduce $arith$ as a subclass of $term$ and add
-the three polymorphic constants of this class.
+Type classes allow constants to be overloaded.  As suggested in
+\S\ref{polymorphic}, let us define the class $arith$ of arithmetic''
+types with the constants ${+} :: [\alpha,\alpha]\To \alpha$ and $0,1 {::} +\alpha$, for $\alpha{::}arith$.  We introduce $arith$ as a subclass of
+$term$ and add the three polymorphic constants of this class.
\index{examples!of theories}
\begin{ttbox}
Arith = FOL +
@@ -665,15 +654,11 @@
and the axiom would hold for any type of class $arith$.  This would
collapse $nat$ to a trivial type:
$Suc(1) = Suc(0+1) = Suc(0)+1 = 1+1 = 1!$
-The class $arith$ as defined above is more specific than necessary.  Many
-types come with a binary operation and identity~(0).  On lists,
-$+$ could be concatenation and 0 the empty list --- but what is 1?  Hence it
-may be better to define $+$ and 0 on $arith$ and introduce a separate
-class, say $k$, containing~1.  Should $k$ be a subclass of $term$ or of
-$arith$?  This depends on the structure of your theories; the design of an
-appropriate class hierarchy may require some experimentation.
+

-We will now work through a small example of formalized mathematics
+\section{Theory example: the natural numbers}
+
+We shall now work through a small example of formalized mathematics
demonstrating many of the theory extension features.

@@ -722,9 +707,9 @@
0+n      & = & n \\
Suc(m)+n & = & Suc(m+n)
\end{eqnarray*}
-This appears to pose difficulties: first-order logic has no functions.
-Following the previous examples, we take advantage of the meta-logic, which
-does have functions.  We also generalise primitive recursion to be
+Primitive recursion appears to pose difficulties: first-order logic has no
+function-valued expressions.  We again take advantage of the meta-logic,
+which does have functions.  We also generalise primitive recursion to be
polymorphic over any type of class~$term$, and declare the addition
function:
\begin{eqnarray*}
@@ -742,24 +727,27 @@
Nat = FOL +
types   nat
arities nat         :: term
-consts  "0"         :: "nat"    ("0")
+consts  "0"         :: "nat"                              ("0")
Suc         :: "nat=>nat"
rec         :: "[nat, 'a, [nat,'a]=>'a] => 'a"
-        "+"         :: "[nat, nat] => nat"              (infixl 60)
-rules   induct      "[| P(0);  !!x. P(x) ==> P(Suc(x)) |]  ==> P(n)"
-        Suc_inject  "Suc(m)=Suc(n) ==> m=n"
+        "+"         :: "[nat, nat] => nat"                (infixl 60)
+rules   Suc_inject  "Suc(m)=Suc(n) ==> m=n"
Suc_neq_0   "Suc(m)=0      ==> R"
+        induct      "[| P(0);  !!x. P(x) ==> P(Suc(x)) |]  ==> P(n)"
rec_0       "rec(0,a,f) = a"
rec_Suc     "rec(Suc(m), a, f) = f(m, rec(m,a,f))"
-        add_def     "m+n == rec(m, n, %x y. Suc(y))"
+        add_def     "m+n == rec(m, n, \%x y. Suc(y))"
end
\end{ttbox}
In axiom {\tt add_def}, recall that \verb|%| stands for~$\lambda$.
-Opening the \ML\ structure {\tt Nat} permits reference to the axioms by \ML\
-identifiers; we may write {\tt induct} instead of {\tt Nat.induct}.
+contains the theory and axioms.  Opening structure {\tt Nat} lets us write
+{\tt induct} instead of {\tt Nat.induct}, and so forth.
\begin{ttbox}
open Nat;
\end{ttbox}
+
+\subsection{Proving some recursion equations}
File {\tt FOL/ex/nat.ML} contains proofs involving this theory of the
natural numbers.  As a trivial example, let us derive recursion equations
for \verb$+$.  Here is the zero case:
@@ -817,7 +805,7 @@
\end{description}
The list {\it insts} consists of pairs $[(v@1,e@1), \ldots, (v@n,e@n)]$,
where $v@1$, \ldots, $v@n$ are names of schematic variables in the rule ---
-with {\bf no} leading question marks!! --- and $e@1$, \ldots, $e@n$ are
+with no leading question marks!! --- and $e@1$, \ldots, $e@n$ are
expressions giving their instantiations.  The expressions are type-checked
in the context of a particular subgoal: free variables receive the same
types as they have in the subgoal, and parameters may appear.  Type
@@ -1024,7 +1012,7 @@
rules of~\ttindex{FOL}.
\begin{ttbox}
Prolog = FOL +
-types   list 1
+types   'a list
arities list    :: (term)term
consts  Nil     :: "'a list"
":"     :: "['a, 'a list]=> 'a list"            (infixr 60)
@@ -1086,10 +1074,10 @@

\subsection{Backtracking}\index{backtracking}
-Prolog backtracking can handle questions that have multiple solutions.
-Which lists $x$ and $y$ can be appended to form the list $[a,b,c,d]$?
-Using \ttindex{REPEAT} to apply the rules, we quickly find the first
-solution, namely $x=[]$ and $y=[a,b,c,d]$:
+Prolog backtracking can answer questions that have multiple solutions.
+Which lists $x$ and $y$ can be appended to form the list $[a,b,c,d]$?  This
+question has five solutions.  Using \ttindex{REPEAT} to apply the rules, we
+quickly find the first solution, namely $x=[]$ and $y=[a,b,c,d]$:
\begin{ttbox}
goal Prolog.thy "app(?x, ?y, a:b:c:d:Nil)";
{\out Level 0}
@@ -1188,8 +1176,8 @@
(resolve_tac rules 1);
\end{ttbox}
Since Prolog uses depth-first search, this tactic is a (slow!)
-\ttindex{choplev}), and apply {\tt prolog_tac}:
+\ttindex{choplev}, and apply {\tt prolog_tac}:
\begin{ttbox}
choplev 0;
{\out Level 0}
--- a/doc-src/Intro/foundations.tex	Wed Mar 23 16:56:44 1994 +0100
+++ b/doc-src/Intro/foundations.tex	Thu Mar 24 13:25:12 1994 +0100
@@ -1,10 +1,13 @@
%% $Id$
\part{Foundations}
-This Part presents Isabelle's logical foundations in detail:
+The following sections discuss Isabelle's logical foundations in detail:
representing logical syntax in the typed $\lambda$-calculus; expressing
inference rules in Isabelle's meta-logic; combining rules by resolution.
-Readers wishing to use Isabelle immediately may prefer to skip straight to
-Part~II, using this Part (via the index) for reference only.
+
+If you wish to use Isabelle immediately, please turn to
+either by returning to this point or by looking up particular items in the
+index.

\begin{figure}
\begin{eqnarray*}
@@ -46,32 +49,32 @@
\caption{Intuitionistic first-order logic} \label{fol-fig}
\end{figure}

-\section{Formalizing logical syntax in Isabelle}
+\section{Formalizing logical syntax in Isabelle}\label{sec:logical-syntax}
\index{Isabelle!formalizing syntax|bold}
Figure~\ref{fol-fig} presents intuitionistic first-order logic,
-including equality and the natural numbers.  Let us see how to formalize
+including equality.  Let us see how to formalize
this logic in Isabelle, illustrating the main features of Isabelle's
polymorphic meta-logic.

Isabelle represents syntax using the typed $\lambda$-calculus.  We declare
a type for each syntactic category of the logic.  We declare a constant for
-each symbol of the logic, giving each $n$-ary operation an $n$-argument
+each symbol of the logic, giving each $n$-place operation an $n$-argument
curried function type.  Most importantly, $\lambda$-abstraction represents
variable binding in quantifiers.

\index{$\To$|bold}\index{types}
Isabelle has \ML-style type constructors such as~$(\alpha)list$, where
$(bool)list$ is the type of lists of booleans.  Function types have the
-form $\sigma\To\tau$, where $\sigma$ and $\tau$ are types.  Functions
-taking $n$~arguments require curried function types; we may abbreviate
+form $\sigma\To\tau$, where $\sigma$ and $\tau$ are types.  Curried
+function types may be abbreviated:
$\sigma@1\To (\cdots \sigma@n\To \tau\cdots) \quad \hbox{as} \quad - [\sigma@1, \ldots, \sigma@n] \To \tau.  + [\sigma@1, \ldots, \sigma@n] \To \tau  The syntax for terms is summarised below. Note that function application is written t(u) rather than the usual t\,u. \[ \begin{array}{ll} - t :: \sigma & \hbox{type constraint, on a term or variable} \\ + t :: \tau & \hbox{type constraint, on a term or bound variable} \\ \lambda x.t & \hbox{abstraction} \\ \lambda x@1\ldots x@n.t & \hbox{curried abstraction, \lambda x@1. \ldots \lambda x@n.t} \\ @@ -81,13 +84,13 @@$

-\subsection{Simple types and constants}
-\index{types!simple|bold}
-The syntactic categories of our logic (Figure~\ref{fol-fig}) are
-{\bf formulae} and {\bf terms}.  Formulae denote truth
-values, so (following logical tradition) we call their type~$o$.  Terms can
-be constructed using~0 and~$Suc$, requiring a type~$nat$ of natural
-numbers.  Later, we shall see how to admit terms of other types.
+\subsection{Simple types and constants}\index{types!simple|bold}
+
+The syntactic categories of our logic (Fig.\ts\ref{fol-fig}) are {\bf
+  formulae} and {\bf terms}.  Formulae denote truth values, so (following
+tradition) let us call their type~$o$.  To allow~0 and~$Suc(t)$ as terms,
+let us declare a type~$nat$ of natural numbers.  Later, we shall see
+how to admit terms of other types.

After declaring the types~$o$ and~$nat$, we may declare constants for the
symbols of our logic.  Since $\bot$ denotes a truth value (falsity) and 0
@@ -105,62 +108,68 @@
{\neg} & :: & o\To o      \\
\conj,\disj,\imp,\bimp  & :: & [o,o]\To o
\end{eqnarray*}
+The binary connectives can be declared as infixes, with appropriate
+precedences, so that we write $P\conj Q\disj R$ instead of
+$\disj(\conj(P,Q), R)$.

-Isabelle allows us to declare the binary connectives as infixes, with
-appropriate precedences, so that we write $P\conj Q\disj R$ instead of
-$\disj(\conj(P,Q), R)$.
+\S\ref{sec:defining-theories} below describes the syntax of Isabelle theory
+files and illustrates it by extending our logic with mathematical induction.

\subsection{Polymorphic types and constants} \label{polymorphic}
\index{types!polymorphic|bold}
+\index{equality!polymorphic}
+
Which type should we assign to the equality symbol?  If we tried
$[nat,nat]\To o$, then equality would be restricted to the natural
numbers; we would have to declare different equality symbols for each
type.  Isabelle's type system is polymorphic, so we could declare
\begin{eqnarray*}
-  {=}  & :: & [\alpha,\alpha]\To o.
+  {=}  & :: & [\alpha,\alpha]\To o,
\end{eqnarray*}
+where the type variable~$\alpha$ ranges over all types.
But this is also wrong.  The declaration is too polymorphic; $\alpha$
-ranges over all types, including~$o$ and $nat\To nat$.  Thus, it admits
+includes types like~$o$ and $nat\To nat$.  Thus, it admits
$\bot=\neg(\bot)$ and $Suc=Suc$ as formulae, which is acceptable in
higher-order logic but not in first-order logic.

-Isabelle's \bfindex{classes} control polymorphism.  Each type variable
-belongs to a class, which denotes a set of types.  Classes are partially
-ordered by the subclass relation, which is essentially the subset relation
-on the sets of types.  They closely resemble the classes of the functional
-Prehofer discuss type inference for type classes~\cite{nipkow-prehofer}.
+Isabelle's {\bf type classes}\index{classes} control
+polymorphism~\cite{nipkow-prehofer}.  Each type variable belongs to a
+class, which denotes a set of types.  Classes are partially ordered by the
+subclass relation, which is essentially the subset relation on the sets of
+types.  They closely resemble the classes of the functional language

Isabelle provides the built-in class $logic$, which consists of the logical
types: the ones we want to reason about.  Let us declare a class $term$, to
consist of all legal types of terms in our logic.  The subclass structure
is now $term\le logic$.

-We declare $nat$ to be in class $term$.  Type variables of class~$term$
-should resemble Standard~\ML's equality type variables, which range over
-those types that possess an equality test.  Thus, we declare the equality
-constant by
+We put $nat$ in class $term$ by declaring $nat{::}term$.  We declare the
+equality constant by
\begin{eqnarray*}
{=}  & :: & [\alpha{::}term,\alpha]\To o
\end{eqnarray*}
+where $\alpha{::}term$ constrains the type variable~$\alpha$ to class
+$term$.  Such type variables resemble Standard~\ML's equality type
+variables.
+
We give function types and~$o$ the class $logic$ rather than~$term$, since
they are not legal types for terms.  We may introduce new types of class
$term$ --- for instance, type $string$ or $real$ --- at any time.  We can
even declare type constructors such as $(\alpha)list$, and state that type
-$(\sigma)list$ belongs to class~$term$ provided $\sigma$ does;  equality
+$(\tau)list$ belongs to class~$term$ provided $\tau$ does; equality
applies to lists of natural numbers but not to lists of formulae.  We may
summarize this paragraph by a set of {\bf arity declarations} for type
-constructors:
-\index{$\To$|bold}\index{arities!declaring}
+constructors: \index{$\To$|bold}\index{arities!declaring}
\begin{eqnarray*}
o     & :: & logic \\
{\To} & :: & (logic,logic)logic \\
nat, string, real     & :: & term \\
list  & :: & (term)term
\end{eqnarray*}
-Higher-order logic, where equality does apply to truth values and
-functions, would require different arity declarations, namely ${o::term}$
+In higher-order logic, equality does apply to truth values and
+functions;  this requires the arity declarations ${o::term}$
and ${{\To}::(term,term)term}$.  The class system can also handle
overloading.\index{overloading|bold} We could declare $arith$ to be the
subclass of $term$ consisting of the arithmetic' types, such as~$nat$.
@@ -176,33 +185,32 @@
{+},{-},{\times},{/}  & :: & [complex,complex]\To complex
\end{eqnarray*}
Isabelle will regard these as distinct constants, each of which can be defined
-separately.  We could even introduce the type $(\alpha)vector$, make
-$(\sigma)vector$ belong to $arith$ provided $\sigma$ is in $arith$, and define
+separately.  We could even introduce the type $(\alpha)vector$ and declare
+its arity as $(arith)arith$.  Then we could declare the constant
\begin{eqnarray*}
-  {+}  & :: & [(\sigma)vector,(\sigma)vector]\To (\sigma)vector
+  {+}  & :: & [(\alpha)vector,(\alpha)vector]\To (\alpha)vector
\end{eqnarray*}
-in terms of ${+} :: [\sigma,\sigma]\To \sigma$.
+and specify it in terms of ${+} :: [\alpha,\alpha]\To \alpha$.

-Although we have pretended so far that a type variable belongs to only one
-class --- Isabelle's concrete syntax helps to uphold this illusion --- it
-may in fact belong to any finite number of classes.  For example suppose
-that we had declared yet another class $ord \le term$, the class of all
+A type variable may belong to any finite number of classes.  Suppose that
+we had declared yet another class $ord \le term$, the class of all
ordered' types, and a constant
\begin{eqnarray*}
{\le}  & :: & [\alpha{::}ord,\alpha]\To o.
\end{eqnarray*}
In this context the variable $x$ in $x \le (x+x)$ will be assigned type
-$\alpha{::}\{arith,ord\}$, i.e.\ $\alpha$ belongs to both $arith$ and $ord$.
-Semantically the set $\{arith,ord\}$ should be understood
-as the intersection of the sets of types represented by $arith$ and $ord$.
-Such intersections of classes are called \bfindex{sorts}.  The empty
-intersection of classes, $\{\}$, contains all types and is thus the
-{\bf universal sort}.
+$\alpha{::}\{arith,ord\}$, which means $\alpha$ belongs to both $arith$ and
+$ord$.  Semantically the set $\{arith,ord\}$ should be understood as the
+intersection of the sets of types represented by $arith$ and $ord$.  Such
+intersections of classes are called \bfindex{sorts}.  The empty
+intersection of classes, $\{\}$, contains all types and is thus the {\bf
+  universal sort}.

-The type checker handles overloading, assigning each term a unique type.  For
-this to be possible, the class and type declarations must satisfy certain
+Even with overloading, each term has a unique, most general type.  For this
+to be possible, the class and type declarations must satisfy certain
technical constraints~\cite{nipkow-prehofer}.

+
\subsection{Higher types and quantifiers}
\index{types!higher|bold}
Quantifiers are regarded as operations upon functions.  Ignoring polymorphism
@@ -242,9 +250,11 @@
\index{implication!meta-level|bold}
\index{quantifiers!meta-level|bold}
\index{equality!meta-level|bold}
-Object-logics are formalized by extending Isabelle's meta-logic, which is
-intuitionistic higher-order logic.  The meta-level connectives are {\bf
-implication}, the {\bf universal quantifier}, and {\bf equality}.
+
+Object-logics are formalized by extending Isabelle's
+meta-logic~\cite{paulson89}, which is intuitionistic higher-order logic.
+The meta-level connectives are {\bf implication}, the {\bf universal
+  quantifier}, and {\bf equality}.
\begin{itemize}
\item The implication $$\phi\Imp \psi$$ means $$\phi$$ implies
$$\psi$$', and expresses logical {\bf entailment}.
@@ -294,7 +304,7 @@
\subsection{Expressing propositional rules}
\index{rules!propositional|bold}
We shall illustrate the use of the meta-logic by formalizing the rules of
-Figure~\ref{fol-fig}.  Each object-level rule is expressed as a meta-level
+Fig.\ts\ref{fol-fig}.  Each object-level rule is expressed as a meta-level
axiom.

One of the simplest rules is $(\conj E1)$.  Making
@@ -375,7 +385,7 @@
The $\forall E$ rule exploits $\beta$-conversion.  Hiding $Trueprop$, the
$\forall$ axioms are
$$\left(\Forall x. P(x)\right) \Imp \forall x.P(x) \eqno(\forall I)$$
-$$\forall x.P(x) \Imp P(t). \eqno(\forall E)$$
+$$(\forall x.P(x)) \Imp P(t). \eqno(\forall E)$$

\noindent
We have defined the object-level universal quantifier~($\forall$)
@@ -420,10 +430,11 @@
such as lists and their operations, or an entire logic.  A mathematical
development typically involves many theories in a hierarchy.  For example,
the pure theory could be extended to form a theory for
-Figure~\ref{fol-fig}; this could be extended in two separate ways to form a
+Fig.\ts\ref{fol-fig}; this could be extended in two separate ways to form a
theory for natural numbers and a theory for lists; the union of these two
could be extended into a theory defining the length of a list:
-$\bf +\begin{tt} +\[ \begin{array}{c@{}c@{}c@{}c@{}c} {} & {} & \hbox{Length} & {} & {} \\ {} & {} & \uparrow & {} & {} \\ @@ -436,27 +447,35 @@ {} & {} &\hbox{Pure}& {} & {} \end{array}$
+\end{tt}
Each Isabelle proof typically works within a single theory, which is
associated with the proof state.  However, many different theories may
coexist at the same time, and you may work in each of these during a single
session.

+\begin{warn}
+  Confusing problems arise if you work in the wrong theory.  Each theory
+  defines its own syntax.  An identifier may be regarded in one theory as a
+  constant and in another as a variable.
+\end{warn}

\section{Proof construction in Isabelle}
-\index{Isabelle!proof construction in|bold}
-There is a one-to-one correspondence between meta-level proofs and
-object-level proofs~\cite{paulson89}.  To each use of a meta-level axiom,
-such as $(\forall I)$, there is a use of the corresponding object-level
-rule.  Object-level assumptions and parameters have meta-level
-counterparts.  The meta-level formalization is {\bf faithful}, admitting no
-correct object-level inferences.  These properties must be demonstrated
-separately for each object-logic.
+\index{Isabelle!proof construction in|bold}
+
+I have elsewhere described the meta-logic and demonstrated it by
+formalizing first-order logic~\cite{paulson89}.  There is a one-to-one
+correspondence between meta-level proofs and object-level proofs.  To each
+use of a meta-level axiom, such as $(\forall I)$, there is a use of the
+corresponding object-level rule.  Object-level assumptions and parameters
+have meta-level counterparts.  The meta-level formalization is {\bf
+  faithful}, admitting no incorrect object-level inferences, and {\bf
+properties must be demonstrated separately for each object-logic.

The meta-logic is defined by a collection of inference rules, including
equational rules for the $\lambda$-calculus, and logical rules.  The rules
for~$\Imp$ and~$\Forall$ resemble those for~$\imp$ and~$\forall$ in
-Figure~\ref{fol-fig}.  Proofs performed using the primitive meta-rules
+Fig.\ts\ref{fol-fig}.  Proofs performed using the primitive meta-rules
would be lengthy; Isabelle proofs normally use certain derived rules.
{\bf Resolution}, in particular, is convenient for backward proof.

@@ -468,35 +487,33 @@
its inputs, we need not state how many clock ticks will be required.  Such
quantities often emerge from the proof.

-\index{variables!schematic|see{unknowns}}
Isabelle provides {\bf schematic variables}, or \bfindex{unknowns}, for
-unification.  Logically, unknowns are free variables.  Pragmatically, they
-may be instantiated during a proof, while ordinary variables remain fixed.
-Unknowns are written with a ?\ prefix and are frequently subscripted:
-$\Var{a}$, $\Var{a@1}$, $\Var{a@2}$, \ldots, $\Var{P}$, $\Var{P@1}$, \ldots.
+unification.  Logically, unknowns are free variables.  But while ordinary
+variables remain fixed, unification may instantiate unknowns.  Unknowns are
+written with a ?\ prefix and are frequently subscripted: $\Var{a}$,
+$\Var{a@1}$, $\Var{a@2}$, \ldots, $\Var{P}$, $\Var{P@1}$, \ldots.

Recall that an inference rule of the form
$\infer{\phi}{\phi@1 & \ldots & \phi@n}$
is formalized in Isabelle's meta-logic as the axiom
$\List{\phi@1; \ldots; \phi@n} \Imp \phi$.
-Such axioms resemble {\sc Prolog}'s Horn clauses, and can be combined by
+Such axioms resemble Prolog's Horn clauses, and can be combined by
resolution --- Isabelle's principal proof method.  Resolution yields both
forward and backward proof.  Backward proof works by unifying a goal with
the conclusion of a rule, whose premises become new subgoals.  Forward proof
works by unifying theorems with the premises of a rule, deriving a new theorem.

-Isabelle axioms will require an extended notion of resolution, because
-they differ from Horn clauses in two major respects:
+Isabelle axioms require an extended notion of resolution.
+They differ from Horn clauses in two major respects:
\begin{itemize}
\item They are written in the typed $\lambda$-calculus, and therefore must be
resolved using higher-order unification.

-  \item Horn clauses are composed of atomic formulae.  Any formula of the form
-$Trueprop(\cdots)$ is atomic, but axioms such as ${\imp}I$ and $\forall I$
-contain non-atomic formulae.
-\index{Trueprop@{$Trueprop$}}
+\item The constituents of a clause need not be atomic formulae.  Any
+  formula of the form $Trueprop(\cdots)$ is atomic, but axioms such as
+  ${\imp}I$ and $\forall I$ contain non-atomic formulae.
\end{itemize}
-Isabelle should not be confused with classical resolution theorem provers
+Isabelle has little in common with classical resolution theorem provers
such as Otter~\cite{wos-bledsoe}.  At the meta-level, Isabelle proves
theorems in their positive form, not by refutation.  However, an
object-logic that includes a contradiction rule may employ a refutation
@@ -649,7 +666,7 @@
an equation like $Suc(Suc(Suc(m)))=Suc(Suc(Suc(0)))$.

-\subsection{Lifting a rule into a context}
+\section{Lifting a rule into a context}
The rules $({\imp}I)$ and $(\forall I)$ may seem unsuitable for
resolution.  They have non-atomic premises, namely $P\Imp Q$ and $\Forall x.P(x)$, while the conclusions of all the rules are atomic (they have the form
@@ -657,20 +674,20 @@
called \bfindex{lifting}.  Let us consider how to construct proofs such as
$\infer[({\imp}I)]{P\imp(Q\imp R)} {\infer[({\imp}I)]{Q\imp R} - {\infer*{R}{[P] & [Q]}}} + {\infer*{R}{[P,Q]}}} \qquad \infer[(\forall I)]{\forall x\,y.P(x,y)} {\infer[(\forall I)]{\forall y.P(x,y)}{P(x,y)}}$

-\subsubsection{Lifting over assumptions}
+\subsection{Lifting over assumptions}
\index{lifting!over assumptions|bold}
Lifting over $\theta\Imp{}$ is the following meta-inference rule:
$\infer{\List{\theta\Imp\phi@1; \ldots; \theta\Imp\phi@n} \Imp (\theta \Imp \phi)} {\List{\phi@1; \ldots; \phi@n} \Imp \phi}$
This is clearly sound: if $\List{\phi@1; \ldots; \phi@n} \Imp \phi$ is true and
-$\theta\Imp\phi@1$, \ldots, $\theta\Imp\phi@n$, $\theta$ are all true then
+$\theta\Imp\phi@1$, \ldots, $\theta\Imp\phi@n$ and $\theta$ are all true then
$\phi$ must be true.  Iterated lifting over a series of meta-formulae
$\theta@k$, \ldots, $\theta@1$ yields an object-rule whose conclusion is
$\List{\theta@1; \ldots; \theta@k} \Imp \phi$.  Typically the $\theta@i$ are
@@ -682,7 +699,10 @@
$(\Var{P} \Imp \Var{Q}) \Imp \Var{P}\imp \Var{Q}.$
To resolve this rule with itself, Isabelle modifies one copy as follows: it
renames the unknowns to $\Var{P@1}$ and $\Var{Q@1}$, then lifts the rule over
-$\Var{P}\Imp{}$:
+$\Var{P}\Imp{}$ to obtain
+$(\Var{P}\Imp (\Var{P@1} \Imp \Var{Q@1})) \Imp (\Var{P} \Imp + (\Var{P@1}\imp \Var{Q@1})).$
+Using the $\List{\cdots}$ abbreviation, this can be written as
$\List{\List{\Var{P}; \Var{P@1}} \Imp \Var{Q@1}; \Var{P}} \Imp \Var{P@1}\imp \Var{Q@1}.$
Unifying $\Var{P}\Imp \Var{P@1}\imp\Var{Q@1}$ with $\Var{P} \Imp @@ -693,7 +713,7 @@ This represents the derived rule $\infer{P\imp(Q\imp R).}{\infer*{R}{[P,Q]}}$ -\subsubsection{Lifting over parameters} +\subsection{Lifting over parameters} \index{lifting!over parameters|bold} An analogous form of lifting handles premises of the form$\Forall x\ldots\,$. Here, lifting prefixes an object-rule's premises and conclusion with$\Forall
@@ -704,8 +724,13 @@
$\infer{\List{\Forall x.\phi@1^x; \ldots; \Forall x.\phi@n^x} \Imp \Forall x.\phi^x,} {\List{\phi@1; \ldots; \phi@n} \Imp \phi}$
-where $\phi^x$ stands for the result of lifting unknowns over~$x$ in $\phi$.
-It is not hard to verify that this meta-inference is sound.
+%
+where $\phi^x$ stands for the result of lifting unknowns over~$x$ in
+$\phi$.  It is not hard to verify that this meta-inference is sound.  If
+$\phi\Imp\psi$ then $\phi^x\Imp\psi^x$ for all~$x$; so if $\phi^x$ is true
+for all~$x$ then so is $\psi^x$.  Thus, from $\phi\Imp\psi$ we conclude
+$(\Forall x.\phi^x) \Imp (\Forall x.\psi^x)$.
+
For example, $(\disj I)$ might be lifted to
$(\Forall x.\Var{P@1}(x)) \Imp (\Forall x. \Var{P@1}(x)\disj \Var{Q@1}(x))$
and $(\forall I)$ to
@@ -722,22 +747,24 @@
\quad\hbox{provided $x$, $y$ not free in the assumptions}
\]
I discuss lifting and parameters at length elsewhere~\cite{paulson89}.
-Miller goes into even greater detail~\cite{miller-jsc}.
+Miller goes into even greater detail~\cite{miller-mixed}.

\section{Backward proof by resolution}
-\index{resolution!in backward proof}\index{proof!backward|bold}
+\index{resolution!in backward proof}\index{proof!backward|bold}
+\index{tactics}\index{tacticals}
+
Resolution is convenient for deriving simple rules and for reasoning
forward from facts.  It can also support backward proof, where we start
with a goal and refine it to progressively simpler subgoals until all have
-been solved.  {\sc lcf} (and its descendants {\sc hol} and Nuprl) provide
+been solved.  {\sc lcf} and its descendants {\sc hol} and Nuprl provide
tactics and tacticals, which constitute a high-level language for
-expressing proof searches.  \bfindex{Tactics} perform primitive refinements
-while \bfindex{tacticals} combine tactics.
+expressing proof searches.  {\bf Tactics} refine subgoals while {\bf
+  tacticals} combine tactics.

\index{LCF}
Isabelle's tactics and tacticals work differently from {\sc lcf}'s.  An
-Isabelle rule is {\bf bidirectional}: there is no distinction between
+Isabelle rule is bidirectional: there is no distinction between
inputs and outputs.  {\sc lcf} has a separate tactic for each rule;
Isabelle performs refinement by any rule in a uniform fashion, using
resolution.
@@ -753,8 +780,8 @@
To prove the formula~$\phi$, take $\phi\Imp \phi$ as the initial proof
state.  This assertion is, trivially, a theorem.  At a later stage in the
backward proof, a typical proof state is $\List{\phi@1; \ldots; \phi@n} -\Imp \phi$.  This proof state is a theorem, guaranteeing that the subgoals
-$\phi@1$,~\ldots,~$\phi@n$ imply~$\phi$.  If $m=0$ then we have
+\Imp \phi$. This proof state is a theorem, ensuring that the subgoals +$\phi@1$,~\ldots,~$\phi@n$imply~$\phi$. If$n=0$then we have proved~$\phi$outright. If$\phi$contains unknowns, they may become instantiated during the proof; a proof state may be$\List{\phi@1; \ldots;
\phi@n} \Imp \phi'$, where$\phi'$is an instance of~$\phi$. @@ -806,32 +833,38 @@ regards them as unique constants with a limited scope --- this enforces parameter provisos~\cite{paulson89}. -The premise represents a proof state with~$n$subgoals, of which the~$i$th is -to be solved by assumption. Isabelle searches the subgoal's context for an -assumption, say$\theta@j$, that can solve it by unification. For each -unifier, the meta-inference returns an instantiated proof state from which the -$i$th subgoal has been removed. Isabelle searches for a unifying assumption; -for readability and robustness, proofs do not refer to assumptions by number. +The premise represents a proof state with~$n$subgoals, of which the~$i$th +is to be solved by assumption. Isabelle searches the subgoal's context for +an assumption~$\theta@j$that can solve it. For each unifier, the +meta-inference returns an instantiated proof state from which the$i$th +subgoal has been removed. Isabelle searches for a unifying assumption; for +readability and robustness, proofs do not refer to assumptions by number. -Consider the proof state$(\List{P(a); P(b)} \Imp P(\Var{x})) \Imp Q(\Var{x})$. +Consider the proof state +$(\List{P(a); P(b)} \Imp P(\Var{x})) \Imp Q(\Var{x}).$ Proof by assumption (with$i=1$, the only possibility) yields two results: \begin{itemize} \item$Q(a)$, instantiating$\Var{x}\equiv a$\item$Q(b)$, instantiating$\Var{x}\equiv b$\end{itemize} Here, proof by assumption affects the main goal. It could also affect -other subgoals. Consider the effect of having the -additional subgoal${\List{P(b); P(c)} \Imp P(\Var{x})}$. +other subgoals; if we also had the subgoal${\List{P(b); P(c)} \Imp
+  P(\Var{x})}$, then$\Var{x}\equiv a$would transform it to${\List{P(b);
+    P(c)} \Imp P(a)}$, which might be unprovable. + \subsection{A propositional proof} \label{prop-proof} \index{examples!propositional} Our first example avoids quantifiers. Given the main goal$P\disj P\imp
P$, Isabelle creates the initial state -$(P\disj P\imp P)\Imp (P\disj P\imp P).$ +$(P\disj P\imp P)\Imp (P\disj P\imp P).$ +% Bear in mind that every proof state we derive will be a meta-theorem, -expressing that the subgoals imply the main goal. The first step is to refine -subgoal~1 by (${\imp}I)$, creating a new state where$P\disj P$is an -assumption: +expressing that the subgoals imply the main goal. Our aim is to reach the +state$P\disj P\imp P$; this meta-theorem is the desired result. + +The first step is to refine subgoal~1 by (${\imp}I)$, creating a new state +where$P\disj P$is an assumption: $(P\disj P\Imp P)\Imp (P\disj P\imp P)$ The next step is$(\disj E)$, which replaces subgoal~1 by three new subgoals. Because of lifting, each subgoal contains a copy of the context --- the @@ -855,8 +888,8 @@ \rbrakk\;& \Imp (P\disj P\imp P) & \hbox{(main goal)} \end{array} \] Both of the remaining subgoals can be proved by assumption. After two such -steps, the proof state is simply the meta-theorem$P\disj P\imp P$. This is -our desired result. +steps, the proof state is$P\disj P\imp P$. + \subsection{A quantifier proof} \index{examples!with quantifiers} @@ -888,8 +921,8 @@ \Imp (\exists x.P(f(x)))\imp(\exists x.P(x)) \] The next step is refinement by$(\exists I)$. The rule is lifted into the -context of the parameter~$x$and the assumption$P(f(x))$. This ensures that -the context is copied to the subgoal and allows the existential witness to +context of the parameter~$x$and the assumption$P(f(x))$. This copies +the context to the subgoal and allows the existential witness to depend upon~$x$: $\left(\Forall x.\List{\exists x.P(f(x)); P(f(x))} \Imp P(\Var{x@2}(x))\right) @@ -907,7 +940,7 @@ of {\sc lcf}, {\sc hol} and Nuprl by operating on entire proof states, rather than on individual subgoals. An Isabelle tactic is a function that takes a proof state and returns a sequence (lazy list) of possible -successor states. Sequences are coded in ML as functions, a standard +successor states. Lazy lists are coded in ML as functions, a standard technique~\cite{paulson91}. Isabelle represents proof states by theorems. Basic tactics execute the meta-rules described above, operating on a @@ -984,21 +1017,27 @@ Elim-resolution is Isabelle's way of getting sequent calculus behaviour from natural deduction rules. It lets an elimination rule consume an -assumption. Elim-resolution takes a rule \List{\psi@1; \ldots; \psi@m} -\Imp \psi a proof state \List{\phi@1; \ldots; \phi@n} \Imp \phi, and a -subgoal number~i. The rule must have at least one premise, thus m>0. -Write the rule's lifted form as \List{\psi'@1; \ldots; \psi'@m} \Imp -\psi'. Ordinary resolution would attempt to reduce~\phi@i, -replacing subgoal~i by m new ones. Elim-resolution tries {\bf -simultaneously} to reduce~\phi@i and to solve~\psi'@1 by assumption; it +assumption. Elim-resolution combines two meta-theorems: +\begin{itemize} + \item a rule \List{\psi@1; \ldots; \psi@m} \Imp \psi + \item a proof state \List{\phi@1; \ldots; \phi@n} \Imp \phi +\end{itemize} +The rule must have at least one premise, thus m>0. Write the rule's +lifted form as \List{\psi'@1; \ldots; \psi'@m} \Imp \psi'. Suppose we +wish to change subgoal number~i. + +Ordinary resolution would attempt to reduce~\phi@i, +replacing subgoal~i by m new ones. Elim-resolution tries +simultaneously to reduce~\phi@i and to solve~\psi'@1 by assumption; it returns a sequence of next states. Each of these replaces subgoal~i by instances of \psi'@2, \ldots, \psi'@m from which the selected assumption has been deleted. Suppose \phi@i has the parameter~x and assumptions \theta@1,~\ldots,~\theta@k. Then \psi'@1, the rule's first premise after lifting, will be $$\Forall x. \List{\theta@1; \ldots; \theta@k}\Imp \psi^{x}@1$$. -Elim-resolution tries to unify simultaneously \psi'\qeq\phi@i and -\lambda x. \theta@j \qeq \lambda x. \psi^{x}@1, for j=1,~\ldots,~k. +Elim-resolution tries to unify \psi'\qeq\phi@i and +\lambda x. \theta@j \qeq \lambda x. \psi^{x}@1 simultaneously, for +j=1,~\ldots,~k. Let us redo the example from~\S\ref{prop-proof}. The elimination rule is~(\disj E), @@ -1037,7 +1076,7 @@ \subsection{Destruction rules} \label{destruct} \index{destruction rules|bold}\index{proof!forward} -Looking back to Figure~\ref{fol-fig}, notice that there are two kinds of +Looking back to Fig.\ts\ref{fol-fig}, notice that there are two kinds of elimination rule. The rules ({\conj}E1), ({\conj}E2), ({\imp}E) and ({\forall}E) extract the conclusion from the major premise. In Isabelle parlance, such rules are called \bfindex{destruction rules}; they are readable @@ -1113,11 +1152,7 @@ schematic variables. \begin{warn} -Schematic variables are not allowed in (meta) assumptions because they would -complicate lifting. Assumptions remain fixed throughout a proof. +Schematic variables are not allowed in meta-assumptions because they would +complicate lifting. Meta-assumptions remain fixed throughout a proof. \end{warn} -% Local Variables: -% mode: latex -% TeX-master: t -% End:  --- a/doc-src/Intro/getting.tex Wed Mar 23 16:56:44 1994 +0100 +++ b/doc-src/Intro/getting.tex Thu Mar 24 13:25:12 1994 +0100 @@ -1,15 +1,11 @@ %% Id -\part{Getting started with Isabelle} -This Part describes how to perform simple proofs using Isabelle. Although -it frequently refers to concepts from the previous Part, a user can get -started without understanding them in detail. - -As of this writing, Isabelle's user interface is \ML. Proofs are conducted -by applying certain \ML{} functions, which update a stored proof state. -Logics are combined and extended by calling \ML{} functions. All syntax -must be expressed using {\sc ascii} characters. Menu-driven graphical -interfaces are under construction, but Isabelle users will always need to -know some \ML, at least to use tacticals. +\part{Getting Started with Isabelle}\label{chap:getting} +We now consider how to perform simple proofs using Isabelle. As of this +writing, Isabelle's user interface is \ML. Proofs are conducted by +applying certain \ML{} functions, which update a stored proof state. All +syntax must be expressed using {\sc ascii} characters. Menu-driven +graphical interfaces are under construction, but Isabelle users will always +need to know some \ML, at least to use tacticals. Object-logics are built upon Pure Isabelle, which implements the meta-logic and provides certain fundamental data structures: types, terms, signatures, @@ -40,24 +36,24 @@ have not been declared as symbols! The parser resolves any ambiguity by taking the longest possible symbol that has been declared. Thus the string {\tt==>} is read as a single symbol. But \hbox{\tt= =>} is read as two -symbols, as is \verb|}}|, as discussed above. +symbols. Identifiers that are not reserved words may serve as free variables or constants. A type identifier consists of an identifier prefixed by a prime, for example {\tt'a} and \hbox{\tt'hello}. An unknown (or type unknown) consists of a question mark, an identifier (or type identifier), and a subscript. The subscript, a non-negative integer, allows the -renaming of unknowns prior to unification. - -The subscript may appear after the identifier, separated by a dot; this -prevents ambiguity when the identifier ends with a digit. Thus {\tt?z6.0} -has identifier \verb|"z6"| and subscript~0, while {\tt?a0.5} has identifier -\verb|"a0"| and subscript~5. If the identifier does not end with a digit, -then no dot appears and a subscript of~0 is omitted; for example, -{\tt?hello} has identifier \verb|"hello"| and subscript zero, while -{\tt?z6} has identifier \verb|"z"| and subscript~6. The same conventions -apply to type unknowns. Note that the question mark is {\bf not} part of the -identifier! +renaming of unknowns prior to unification.% +% +\footnote{The subscript may appear after the identifier, separated by a + dot; this prevents ambiguity when the identifier ends with a digit. Thus + {\tt?z6.0} has identifier {\tt"z6"} and subscript~0, while {\tt?a0.5} + has identifier {\tt"a0"} and subscript~5. If the identifier does not + end with a digit, then no dot appears and a subscript of~0 is omitted; + for example, {\tt?hello} has identifier {\tt"hello"} and subscript + zero, while {\tt?z6} has identifier {\tt"z"} and subscript~6. The same + conventions apply to type unknowns. The question mark is {\it not\/} + part of the identifier!} \subsection{Syntax of types and terms} @@ -65,7 +61,7 @@ \index{classes!built-in|bold} Classes are denoted by identifiers; the built-in class \ttindex{logic} contains the logical' types. Sorts are lists of classes enclosed in -braces~\{ and \}; singleton sorts may be abbreviated by dropping the braces. +braces~\} and \{; singleton sorts may be abbreviated by dropping the braces. \index{types!syntax|bold} Types are written with a syntax like \ML's. The built-in type \ttindex{prop} @@ -150,14 +146,15 @@ To illustrate the notation, consider two axioms for first-order logic: \List{P; Q} \Imp P\conj Q \eqno(\conj I) \List{\exists x.P(x); \Forall x. P(x)\imp Q} \Imp Q \eqno(\exists E) -Using the {\tt [|\ldots|]} shorthand, ({\conj}I) translates literally into +Using the {\tt [|\ldots|]} shorthand, ({\conj}I) translates into {\sc ascii} characters as \begin{ttbox} [| ?P; ?Q |] ==> ?P & ?Q \end{ttbox} -The schematic variables let unification instantiate the rule. To -avoid cluttering rules with question marks, Isabelle converts any free -variables in a rule to schematic variables; we normally write ({\conj}I) as +The schematic variables let unification instantiate the rule. To avoid +cluttering logic definitions with question marks, Isabelle converts any +free variables in a rule to schematic variables; we normally declare +({\conj}I) as \begin{ttbox} [| P; Q |] ==> P & Q \end{ttbox} @@ -220,14 +217,14 @@ {\out [| ?P1 --> ?P --> ?Q; ?P1; ?P |] ==> ?Q} {\out val it = "[| ?P1 --> ?P --> ?Q; ?P1; ?P |] ==> ?Q" : thm} \end{ttbox} -In the Isabelle documentation, user's input appears in {\tt typewriter - characters}, and output appears in {\sltt slanted typewriter characters}. -\ML's response {\out val }~\ldots{} is compiler-dependent and will -sometimes be suppressed. This session illustrates two formats for the -display of theorems. Isabelle's top-level displays theorems as ML values, -enclosed in quotes.\footnote{This works under both Poly/ML and Standard ML - of New Jersey.} Printing functions like {\tt prth} omit the quotes and -the surrounding {\tt val \ldots :\ thm}. +User input appears in {\tt typewriter characters}, and output appears in +{\sltt slanted typewriter characters}. \ML's response {\out val }~\ldots{} +is compiler-dependent and will sometimes be suppressed. This session +illustrates two formats for the display of theorems. Isabelle's top-level +displays theorems as ML values, enclosed in quotes.\footnote{This works + under both Poly/ML and Standard ML of New Jersey.} Printing commands +like {\tt prth} omit the quotes and the surrounding {\tt val \ldots :\ + thm}. Ignoring their side-effects, the commands are identity functions. To contrast {\tt RS} with {\tt RSN}, we resolve \ttindex{conjunct1}, which stands for~(\conj E1), with~\ttindex{mp}. @@ -242,25 +239,30 @@ \qquad \infer[({\imp}E)]{Q}{P\imp Q & \infer[({\conj}E1)]{P}{P\conj Q@1}}$ -The printing commands return their argument; the \ML{} identifier~{\tt it} -denotes the value just printed. You may derive a rule by pasting other -rules together. Below, \ttindex{spec} stands for~$(\forall E)$: +% +Rules can be derived by pasting other rules together. Let us join +\ttindex{spec}, which stands for~$(\forall E)$, with {\tt mp} and {\tt + conjunct1}. In \ML{}, the identifier~{\tt it} denotes the value just +printed. \begin{ttbox} spec; {\out val it = "ALL x. ?P(x) ==> ?P(?x)" : thm} it RS mp; -{\out val it = "[| ALL x. ?P3(x) --> ?Q2(x); ?P3(?x1) |] ==> ?Q2(?x1)" : thm} +{\out val it = "[| ALL x. ?P3(x) --> ?Q2(x); ?P3(?x1) |] ==>} +{\out ?Q2(?x1)" : thm} it RS conjunct1; -{\out val it = "[| ALL x. ?P4(x) --> ?P6(x) & ?Q5(x); ?P4(?x2) |] ==> ?P6(?x2)"} +{\out val it = "[| ALL x. ?P4(x) --> ?P6(x) & ?Q5(x); ?P4(?x2) |] ==>} +{\out ?P6(?x2)" : thm} standard it; -{\out val it = "[| ALL x. ?P(x) --> ?Pa(x) & ?Q(x); ?P(?x) |] ==> ?Pa(?x)"} +{\out val it = "[| ALL x. ?P(x) --> ?Pa(x) & ?Q(x); ?P(?x) |] ==>} +{\out ?Pa(?x)" : thm} \end{ttbox} By resolving$(\forall E)$with (${\imp}E)$and (${\conj}E1)$, we have derived a destruction rule for formulae of the form$\forall x.
P(x)\imp(Q(x)\conj R(x))$. Used with destruct-resolution, such specialized rules provide a way of referring to particular assumptions. -\subsection{Flex-flex equations} \label{flexflex} +\subsection{*Flex-flex equations} \label{flexflex} \index{flex-flex equations|bold}\index{unknowns!of function type} In higher-order unification, {\bf flex-flex} equations are those where both sides begin with a function unknown, such as$\Var{f}(0)\qeq\Var{g}(0)$. @@ -294,7 +296,7 @@ Isabelle simplifies flex-flex equations to eliminate redundant bound variables. In$\lambda x\,y.\Var{f}(k(y),x) \qeq \lambda x\,y.\Var{g}(y)$, there is no bound occurrence of~$x$on the right side; thus, there will be -none on the left, in a common instance of these terms. Choosing a new +none on the left in a common instance of these terms. Choosing a new variable~$\Var{h}$, Isabelle assigns$\Var{f}\equiv \lambda u\,v.?h(u)$, simplifying the left side to$\lambda x\,y.\Var{h}(k(y))$. Dropping$x$from the equation leaves$\lambda y.\Var{h}(k(y)) \qeq \lambda
@@ -387,15 +389,14 @@
applies the {\it tactic\/} to the current proof
state, raising an exception if the tactic fails.

-\item[\ttindexbold{undo}(); ]
-reverts to the previous proof state.  Undo can be repeated but cannot be
-undone.  Do not omit the parentheses; typing {\tt undo;} merely causes \ML\
-to echo the value of that function.
+\item[\ttindexbold{undo}(); ]
+  reverts to the previous proof state.  Undo can be repeated but cannot be
+  undone.  Do not omit the parentheses; typing {\tt\ \ undo;\ \ } merely
+  causes \ML\ to echo the value of that function.

\item[\ttindexbold{result}()]
returns the theorem just proved, in a standard format.  It fails if
-unproved subgoals are left or if the main goal does not match the one you
-started with.
+unproved subgoals are left, etc.
\end{description}
The commands and tactics given above are cumbersome for interactive use.
Although our examples will use the full commands, you may prefer Isabelle's
@@ -415,11 +416,13 @@

\subsection{A trivial example in propositional logic}
\index{examples!propositional}
-Directory {\tt FOL} of the Isabelle distribution defines the \ML\
-identifier~\ttindex{FOL.thy}, which denotes the theory of first-order
-logic.  Let us try the example from~\S\ref{prop-proof}, entering the goal
-$P\disj P\imp P$ in that theory.\footnote{To run these examples, see the
-file {\tt FOL/ex/intro.ML}.}
+
+Directory {\tt FOL} of the Isabelle distribution defines the theory of
+first-order logic.  Let us try the example from \S\ref{prop-proof},
+entering the goal $P\disj P\imp P$ in that theory.\footnote{To run these
+  examples, see the file {\tt FOL/ex/intro.ML}.  The files {\tt README} and
+  {\tt Makefile} on the directories {\tt Pure} and {\tt FOL} explain how to
+  build first-order logic.}
\begin{ttbox}
goal FOL.thy "P|P --> P";
{\out Level 0}
@@ -448,9 +451,10 @@
{\out 2. [| P | P; ?P1 |] ==> P}
{\out 3. [| P | P; ?Q1 |] ==> P}
\end{ttbox}
-At Level~2 there are three subgoals, each provable by
-assumption.  We deviate from~\S\ref{prop-proof} by tackling subgoal~3
-first, using \ttindex{assume_tac}.  This updates {\tt?Q1} to~{\tt P}.
+At Level~2 there are three subgoals, each provable by assumption.  We
+deviate from~\S\ref{prop-proof} by tackling subgoal~3 first, using
+\ttindex{assume_tac}.  This affects subgoal~1, updating {\tt?Q1} to~{\tt
+  P}.
\begin{ttbox}
by (assume_tac 3);
{\out Level 3}
@@ -458,7 +462,7 @@
{\out 1. P | P ==> ?P1 | P}
{\out 2. [| P | P; ?P1 |] ==> P}
\end{ttbox}
-Next we tackle subgoal~2, instantiating {\tt?P1} to~{\tt P}.
+Next we tackle subgoal~2, instantiating {\tt?P1} to~{\tt P} in subgoal~1.
\begin{ttbox}
by (assume_tac 2);
{\out Level 4}
@@ -483,19 +487,23 @@
throughout the proof.  Isabelle finally converts them to scheme variables
so that the resulting theorem can be instantiated with any formula.

+As an exercise, try doing the proof as in \S\ref{prop-proof}, observing how
+instantiations affect the proof state.

-\subsection{Proving a distributive law}
+
+\subsection{Part of a distributive law}
\index{examples!propositional}
To demonstrate the tactics \ttindex{eresolve_tac}, \ttindex{dresolve_tac}
-and the tactical \ttindex{REPEAT}, we shall prove part of the distributive
-law $(P\conj Q)\disj R \iff (P\disj R)\conj (Q\disj R)$.
-
+and the tactical \ttindex{REPEAT}, let us prove part of the distributive
+law
+$(P\conj Q)\disj R \,\bimp\, (P\disj R)\conj (Q\disj R).$
We begin by stating the goal to Isabelle and applying~$({\imp}I)$ to it:
\begin{ttbox}
goal FOL.thy "(P & Q) | R  --> (P | R)";
{\out Level 0}
{\out P & Q | R --> P | R}
{\out  1. P & Q | R --> P | R}
+\ttbreak
by (resolve_tac [impI] 1);
{\out Level 1}
{\out P & Q | R --> P | R}
@@ -515,7 +523,8 @@
replacing the assumption $P\conj Q$ by~$P$.  Normally we should apply the
rule~(${\conj}E)$, given in~\S\ref{destruct}.  That is an elimination rule
and requires {\tt eresolve_tac}; it would replace $P\conj Q$ by the two
-assumptions~$P$ and~$Q$.  The present example does not need~$Q$.
+assumptions~$P$ and~$Q$.  Because the present example does not need~$Q$, we
+may try out {\tt dresolve_tac}.
\begin{ttbox}
by (dresolve_tac [conjunct1] 1);
{\out Level 3}
@@ -556,7 +565,7 @@
function unknown and $x$ and~$z$ are parameters.  This may be replaced by
any term, possibly containing free occurrences of $x$ and~$z$.

-\subsection{Two quantifier proofs, successful and not}
+\subsection{Two quantifier proofs: a success and a failure}
\index{examples!with quantifiers}
Let us contrast a proof of the theorem $\forall x.\exists y.x=y$ with an
attempted proof of the non-theorem $\exists y.\forall x.x=y$.  The former
@@ -566,7 +575,7 @@
but we need never say so. This choice is forced by the reflexive law for
equality, and happens automatically.

-\subsubsection{The successful proof}
+\paragraph{The successful proof.}
The proof of $\forall x.\exists y.x=y$ demonstrates the introduction rules
$(\forall I)$ and~$(\exists I)$.  We state the goal and apply $(\forall I)$:
\begin{ttbox}
@@ -583,7 +592,8 @@
The variable~{\tt x} is no longer universally quantified, but is a
parameter in the subgoal; thus, it is universally quantified at the
meta-level.  The subgoal must be proved for all possible values of~{\tt x}.
-We apply the rule $(\exists I)$:
+
+To remove the existential quantifier, we apply the rule $(\exists I)$:
\begin{ttbox}
by (resolve_tac [exI] 1);
{\out Level 2}
@@ -606,8 +616,8 @@
and~$\Var{y@1}$ are both instantiated to the identity function, and
$x=\Var{y@1}(x)$ collapses to~$x=x$ by $\beta$-reduction.

-\subsubsection{The unsuccessful proof}
-We state the goal $\exists y.\forall x.x=y$, which is {\bf not} a theorem, and
+\paragraph{The unsuccessful proof.}
+We state the goal $\exists y.\forall x.x=y$, which is not a theorem, and
try~$(\exists I)$:
\begin{ttbox}
goal FOL.thy "EX y. ALL x. x=y";
@@ -635,22 +645,21 @@
by (resolve_tac [refl] 1);
\end{ttbox}
-No other choice of rules seems likely to complete the proof.  Of course,
-this is no guarantee that Isabelle cannot prove $\exists y.\forall x.x=y$
-or other invalid assertions.  We must appeal to the soundness of
-first-order logic and the faithfulness of its encoding in
-Isabelle~\cite{paulson89}, and must trust the implementation.
+There can be no proof of $\exists y.\forall x.x=y$ by the soundness of
+first-order logic.  I have elsewhere proved the faithfulness of Isabelle's
+encoding of first-order logic~\cite{paulson89}; there could, of course, be
+faults in the implementation.

\subsection{Nested quantifiers}
\index{examples!with quantifiers}
-Multiple quantifiers create complex terms.  Proving $(\forall x\,y.P(x,y)) -\imp (\forall z\,w.P(w,z))$, will demonstrate how parameters and
-unknowns develop.  If they appear in the wrong order, the proof will fail.
+Multiple quantifiers create complex terms.  Proving
+$(\forall x\,y.P(x,y)) \imp (\forall z\,w.P(w,z))$
+will demonstrate how parameters and unknowns develop.  If they appear in
+the wrong order, the proof will fail.
+
This section concludes with a demonstration of {\tt REPEAT}
and~{\tt ORELSE}.
-
-The start of the proof is routine.
\begin{ttbox}
goal FOL.thy "(ALL x y.P(x,y))  -->  (ALL z w.P(w,z))";
{\out Level 0}
@@ -663,7 +672,7 @@
{\out  1. ALL x y. P(x,y) ==> ALL z w. P(w,z)}
\end{ttbox}

-\subsubsection{The wrong approach}
+\paragraph{The wrong approach.}
Using \ttindex{dresolve_tac}, we apply the rule $(\forall E)$, bound to the
\ML\ identifier \ttindex{spec}.  Then we apply $(\forall I)$.
\begin{ttbox}
@@ -678,7 +687,7 @@
{\out  1. !!z. ALL y. P(?x1,y) ==> ALL w. P(w,z)}
\end{ttbox}
The unknown {\tt ?u} and the parameter {\tt z} have appeared.  We again
-apply $(\forall I)$ and~$(\forall E)$.
+apply $(\forall E)$ and~$(\forall I)$.
\begin{ttbox}
by (dresolve_tac [spec] 1);
{\out Level 4}
@@ -701,7 +710,7 @@
{\out uncaught exception ERROR}
\end{ttbox}

-\subsubsection{The right approach}
+\paragraph{The right approach.}
To do this proof, the rules must be applied in the correct order.
Eigenvariables should be created before unknowns.  The
\ttindex{choplev} command returns to an earlier stage of the proof;
@@ -712,8 +721,7 @@
{\out (ALL x y. P(x,y)) --> (ALL z w. P(w,z))}
{\out  1. ALL x y. P(x,y) ==> ALL z w. P(w,z)}
\end{ttbox}
-Previously, we made the mistake of applying $(\forall E)$; this time, we
-apply $(\forall I)$ twice.
+Previously we made the mistake of applying $(\forall E)$ before $(\forall I)$.
\begin{ttbox}
by (resolve_tac [allI] 1);
{\out Level 2}
@@ -747,15 +755,11 @@
{\out No subgoals!}
\end{ttbox}

-\subsubsection{A one-step proof using tacticals}
-\index{tacticals}
-\index{examples!of tacticals}
-Repeated application of rules can be an effective theorem-proving
-procedure, but the rules should be attempted in an order that delays the
-creation of unknowns.  As we have just seen, \ttindex{allI} should be
-attempted before~\ttindex{spec}, while \ttindex{assume_tac} generally can
-be attempted first.  Such priorities can easily be expressed
-using~\ttindex{ORELSE}, and repeated using~\ttindex{REPEAT}.  Let us return
+\paragraph{A one-step proof using tacticals.}
+\index{tacticals} \index{examples!of tacticals}
+
+Repeated application of rules can be effective, but the rules should be
+attempted in an order that delays the creation of unknowns.  Let us return
to the original goal using \ttindex{choplev}:
\begin{ttbox}
choplev 0;
@@ -763,10 +767,12 @@
{\out (ALL x y. P(x,y)) --> (ALL z w. P(w,z))}
{\out  1. (ALL x y. P(x,y)) --> (ALL z w. P(w,z))}
\end{ttbox}
-A repetitive procedure proves it:
+As we have just seen, \ttindex{allI} should be attempted
+before~\ttindex{spec}, while \ttindex{assume_tac} generally can be
+attempted first.  Such priorities can easily be expressed
+using~\ttindex{ORELSE}, and repeated using~\ttindex{REPEAT}.
\begin{ttbox}
-by (REPEAT (assume_tac 1
-     ORELSE resolve_tac [impI,allI] 1
+by (REPEAT (assume_tac 1 ORELSE resolve_tac [impI,allI] 1
ORELSE dresolve_tac [spec] 1));
{\out Level 1}
{\out (ALL x y. P(x,y)) --> (ALL z w. P(w,z))}
@@ -776,8 +782,10 @@

\subsection{A realistic quantifier proof}
\index{examples!with quantifiers}
-A proof of $(\forall x. P(x) \imp Q) \imp (\exists x. P(x)) \imp Q$
-demonstrates the practical use of parameters and unknowns.
+To see the practical use of parameters and unknowns, let us prove half of
+the equivalence
+$(\forall x. P(x) \imp Q) \,\bimp\, ((\exists x. P(x)) \imp Q).$
+We state the left-to-right half to Isabelle in the normal way.
Since $\imp$ is nested to the right, $({\imp}I)$ can be applied twice; we
use \ttindex{REPEAT}:
\begin{ttbox}
@@ -810,9 +818,8 @@
{\out (ALL x. P(x) --> Q) --> (EX x. P(x)) --> Q}
{\out  1. !!x. [| P(x); P(?x3(x)) --> Q |] ==> Q}
\end{ttbox}
-Because the parameter~{\tt x} appeared first, the unknown
-term~{\tt?x3(x)} may depend upon it.  Had we eliminated the universal
-quantifier before the existential, this would not be so.
+Because we applied $(\exists E)$ before $(\forall E)$, the unknown
+term~{\tt?x3(x)} may depend upon the parameter~{\tt x}.

Although $({\imp}E)$ is a destruction rule, it works with
\ttindex{eresolve_tac} to perform backward chaining.  This technique is
@@ -874,7 +881,8 @@
\ttback       (EX y. (ALL z. P(z,y) --> P(z,f(x))) & P(x,y))";
{\out Level 0}
{\out ALL x. P(x,f(x)) <-> (EX y. (ALL z. P(z,y) --> P(z,f(x))) & P(x,y))}
-{\out  1. ALL x. P(x,f(x)) <-> (EX y. (ALL z. P(z,y) --> P(z,f(x))) & P(x,y))}
+{\out  1. ALL x. P(x,f(x)) <->}
+{\out     (EX y. (ALL z. P(z,y) --> P(z,f(x))) & P(x,y))}
\end{ttbox}
Again, subgoal~1 succumbs immediately.
\begin{ttbox}
--- a/doc-src/Intro/intro.tex	Wed Mar 23 16:56:44 1994 +0100
+++ b/doc-src/Intro/intro.tex	Thu Mar 24 13:25:12 1994 +0100
@@ -1,4 +1,4 @@
-\documentstyle[a4,12pt,proof,iman,alltt]{article}
+\documentstyle[a4,12pt,proof,iman,extra]{article}
%% $Id$
%% run    bibtex intro         to prepare bibliography
%% run    ../sedindex intro    to prepare index file
@@ -95,13 +95,12 @@
development and will continue to change.

\subsubsection*{Overview}
-This manual consists of three parts.
-Part~I discusses the Isabelle's foundations.
-Part~II, presents simple on-line sessions, starting with forward proof.
-It also covers basic tactics and tacticals, and some commands for invoking
-Part~III contains further examples for users with a bit of experience.
-It explains how to derive rules define theories, and concludes with an
+This manual consists of three parts.  Part~I discusses the Isabelle's
+foundations.  Part~II, presents simple on-line sessions, starting with
+forward proof.  It also covers basic tactics and tacticals, and some
+commands for invoking them.  Part~III contains further examples for users
+with a bit of experience.  It explains how to derive rules define theories,

Isabelle's Reference Manual and Object-Logics manual contain more details.
They assume familiarity with the concepts presented here.
@@ -142,9 +141,8 @@
\include{getting}
\end{document}