doc-src/Tutorial/basics.tex
author wenzelm
Sat, 01 Jul 2000 19:49:20 +0200
changeset 9226 cbe6144f0f15
parent 6691 8a1b5f9d8420
child 9255 2ceb11a2e190
permissions -rw-r--r--
tuned;

\chapter{Basic Concepts}

\section{Introduction}

This is a tutorial on how to use Isabelle/HOL as a specification and
verification system. Isabelle is a generic system for implementing logical
formalisms, and Isabelle/HOL is the specialization of Isabelle for
HOL, which abbreviates Higher-Order Logic. We introduce HOL step by step
following the equation
\[ \mbox{HOL} = \mbox{Functional Programming} + \mbox{Logic}. \]
We assume that the reader is familiar with the basic concepts of both fields.
For excellent introductions to functional programming consult the textbooks
by Bird and Wadler~\cite{Bird-Wadler} or Paulson~\cite{paulson-ml2}.  Although
this tutorial initially concentrates on functional programming, do not be
misled: HOL can express most mathematical concepts, and functional
programming is just one particularly simple and ubiquitous instance.

A tutorial is by definition incomplete. To fully exploit the power of the
system you need to consult the Isabelle Reference Manual~\cite{isabelle-ref}
for details about Isabelle and the Isabelle/HOL manual~\cite{isabelle-HOL}
for details relating to HOL. Both manuals have a comprehensive index.

\section{Theories, proofs and interaction}
\label{sec:Basic:Theories}

Working with Isabelle means creating two different kinds of documents:
theories and proof scripts. Roughly speaking, a \bfindex{theory} is a named
collection of types and functions, much like a module in a programming
language or a specification in a specification language. In fact, theories in
HOL can be either. Theories must reside in files with the suffix
\texttt{.thy}. The general format of a theory file \texttt{T.thy} is
\begin{ttbox}
T = B\(@1\) + \(\cdots\) + B\(@n\) +
\({\langle}declarations{\rangle}\)
end
\end{ttbox}
where \texttt{B}$@1$, \dots, \texttt{B}$@n$ are the names of existing
theories that \texttt{T} is based on and ${\langle}declarations{\rangle}$ stands for the
newly introduced concepts (types, functions etc). The \texttt{B}$@i$ are the
direct \textbf{parent theories}\indexbold{parent theory} of \texttt{T}.
Everything defined in the parent theories (and their parents \dots) is
automatically visible. To avoid name clashes, identifiers can be qualified by
theory names as in \texttt{T.f} and \texttt{B.f}. HOL's theory library is
available online at

\begin{center}\small
  \begin{tabular}{l}
    \url{http://www.cl.cam.ac.uk/Research/HVG/Isabelle/library/} \\
    \url{http://isabelle.in.tum.de/library/} \\
  \end{tabular}
\end{center}

and is recommended browsing.
\begin{warn}
  HOL contains a theory \ttindexbold{Main}, the union of all the basic
  predefined theories like arithmetic, lists, sets, etc.\ (see the online
  library).  Unless you know what you are doing, always include \texttt{Main}
  as a direct or indirect parent theory of all your theories.
\end{warn}

This tutorial is concerned with introducing you to the different linguistic
constructs that can fill ${\langle}declarations{\rangle}$ in the above theory template.
A complete grammar of the basic constructs is found in Appendix~A
of~\cite{isabelle-ref}, for reference in times of doubt.

The tutorial is also concerned with showing you how to prove theorems about
the concepts in a theory. This involves invoking predefined theorem proving
commands. Because Isabelle is written in the programming language
ML,\footnote{Many concepts in HOL and ML are similar. Make sure you do not
  confuse the two levels.} interacting with Isabelle means calling ML
functions. Hence \bfindex{proof scripts} are sequences of calls to ML
functions that perform specific theorem proving tasks. Nevertheless,
familiarity with ML is absolutely not required.  All proof scripts for theory
\texttt{T} (defined in file \texttt{T.thy}) should be contained in file
\texttt{T.ML}. Theory and proof scripts are loaded (and checked!) by calling
the ML function \ttindexbold{use_thy}:
\begin{ttbox}
use_thy "T";
\end{ttbox}

There are more advanced interfaces for Isabelle that hide the ML level from
you and replace function calls by menu selection. There is even a special
font with mathematical symbols. For details see the Isabelle home page.  This
tutorial concentrates on the bare essentials and ignores such niceties.

\section{Types, terms and formulae}
\label{sec:TypesTermsForms}

Embedded in the declarations of a theory are the types, terms and formulae of
HOL. HOL is a typed logic whose type system resembles that of functional
programming languages like ML or Haskell. Thus there are
\begin{description}
\item[base types,] in particular \ttindex{bool}, the type of truth values,
and \ttindex{nat}, the type of natural numbers.
\item[type constructors,] in particular \ttindex{list}, the type of
lists, and \ttindex{set}, the type of sets. Type constructors are written
postfix, e.g.\ \texttt{(nat)list} is the type of lists whose elements are
natural numbers. Parentheses around single arguments can be dropped (as in
\texttt{nat list}), multiple arguments are separated by commas (as in
\texttt{(bool,nat)foo}).
\item[function types,] denoted by \ttindexbold{=>}. In HOL
\texttt{=>} represents {\em total} functions only. As is customary,
\texttt{$\tau@1$ => $\tau@2$ => $\tau@3$} means
\texttt{$\tau@1$ => ($\tau@2$ => $\tau@3$)}. Isabelle also supports the
notation \texttt{[$\tau@1,\dots,\tau@n$] => $\tau$} which abbreviates
\texttt{$\tau@1$ => $\cdots$ => $\tau@n$ => $\tau$}.
\item[type variables,] denoted by \texttt{'a}, \texttt{'b} etc, just like in
ML. They give rise to polymorphic types like \texttt{'a => 'a}, the type of the
identity function.
\end{description}
\begin{warn}
  Types are extremely important because they prevent us from writing
  nonsense.  Isabelle insists that all terms and formulae must be well-typed
  and will print an error message if a type mismatch is encountered. To
  reduce the amount of explicit type information that needs to be provided by
  the user, Isabelle infers the type of all variables automatically (this is
  called \bfindex{type inference}) and keeps quiet about it. Occasionally
  this may lead to misunderstandings between you and the system. If anything
  strange happens, we recommend to set the flag \ttindexbold{show_types} that
  tells Isabelle to display type information that is usually suppressed:
  simply type
\begin{ttbox}
set show_types;
\end{ttbox}

\noindent
at the ML-level. This can be reversed by \texttt{reset show_types;}.
\end{warn}


\textbf{Terms}\indexbold{term}
are formed as in functional programming by applying functions to
arguments. If \texttt{f} is a function of type \texttt{$\tau@1$ => $\tau@2$}
and \texttt{t} is a term of type $\tau@1$ then \texttt{f~t} is a term of type
$\tau@2$. HOL also supports infix functions like \texttt{+} and some basic
constructs from functional programming:
\begin{description}
\item[\texttt{if $b$ then $t@1$ else $t@2$}]\indexbold{*if}
means what you think it means and requires that
$b$ is of type \texttt{bool} and $t@1$ and $t@2$ are of the same type.
\item[\texttt{let $x$ = $t$ in $u$}]\indexbold{*let}
is equivalent to $u$ where all occurrences of $x$ have been replaced by
$t$. For example,
\texttt{let x = 0 in x+x} means \texttt{0+0}. Multiple bindings are separated
by semicolons: \texttt{let $x@1$ = $t@1$; \dots; $x@n$ = $t@n$ in $u$}.
\item[\texttt{case $e$ of $c@1$ => $e@1$ | \dots | $c@n$ => $e@n$}]
\indexbold{*case}
evaluates to $e@i$ if $e$ is of the form
$c@i$. See~\S\ref{sec:case-expressions} for details.
\end{description}

Terms may also contain $\lambda$-abstractions. For example, $\lambda
x. x+1$ is the function that takes an argument $x$ and returns $x+1$. In
Isabelle we write \texttt{\%x.~x+1}.\index{==>@{\tt\%}|bold}
Instead of \texttt{\%x.~\%y.~\%z.~t} we can write \texttt{\%x~y~z.~t}.

\textbf{Formulae}\indexbold{formula}
are terms of type \texttt{bool}. There are the basic
constants \ttindexbold{True} and \ttindexbold{False} and the usual logical
connectives (in decreasing order of priority):
\verb$~$\index{$HOL1@{\ttnot}|bold} (`not'),
\texttt{\&}\index{$HOL2@{\tt\&}|bold} (`and'),
\texttt{|}\index{$HOL2@{\ttor}|bold} (`or') and
\texttt{-->}\index{$HOL2@{\tt-->}|bold} (`implies'),
all of which (except the unary \verb$~$) associate to the right. In
particular \texttt{A --> B --> C} means \texttt{A --> (B --> C)} and is thus
logically equivalent with \texttt{A \& B --> C}
(which is \texttt{(A \& B) --> C}).

Equality is available in the form of the infix function
\texttt{=} of type \texttt{'a => 'a => bool}. Thus \texttt{$t@1$ = $t@2$} is
a formula provided $t@1$ and $t@2$ are terms of the same type. In case $t@1$
and $t@2$ are of type \texttt{bool}, \texttt{=} acts as if-and-only-if.

The syntax for quantifiers is
\texttt{!~$x$.$\,P$}\index{$HOLQ@{\ttall}|bold} (`for all $x$') and
\texttt{?~$x$.$\,P$}\index{$HOLQ@{\tt?}|bold} (`exists $x$').
There is even \texttt{?!~$x$.$\,P$}\index{$HOLQ@{\ttuniquex}|bold}, which
means that there exists exactly one $x$ that satisfies $P$. Instead of
\texttt{!} and \texttt{?} you may also write \texttt{ALL} and \texttt{EX}.
Nested quantifications can be abbreviated:
\texttt{!$x~y~z$.$\,P$} means \texttt{!$x$.~!$y$.~!$z$.$\,P$}.

Despite type inference, it is sometimes necessary to attach explicit
\bfindex{type constraints} to a term.  The syntax is \texttt{$t$::$\tau$} as
in \texttt{x < (y::nat)}. Note that \texttt{::} binds weakly and should
therefore be enclosed in parentheses: \texttt{x < y::nat} is ill-typed
because it is interpreted as \texttt{(x < y)::nat}. The main reason for type
constraints are overloaded functions like \texttt{+}, \texttt{*} and
\texttt{<}. (See \S\ref{sec:TypeClasses} for a full discussion of
overloading.)

\begin{warn}
In general, HOL's concrete syntax tries to follow the conventions of
functional programming and mathematics. Below we list the main rules that you
should be familiar with to avoid certain syntactic traps. A particular
problem for novices can be the priority of operators. If you are unsure, use
more rather than fewer parentheses. In those cases where Isabelle echoes your
input, you can see which parentheses are dropped---they were superfluous. If
you are unsure how to interpret Isabelle's output because you don't know
where the (dropped) parentheses go, set (and possibly reset) the flag
\ttindexbold{show_brackets}:
\begin{ttbox}
set show_brackets; \(\dots\); reset show_brackets;
\end{ttbox}
\end{warn}

\begin{itemize}
\item
Remember that \texttt{f t u} means \texttt{(f t) u} and not \texttt{f(t u)}!
\item
Isabelle allows infix functions like \texttt{+}. The prefix form of function
application binds more strongly than anything else and hence \texttt{f~x + y}
means \texttt{(f~x)~+~y} and not \texttt{f(x+y)}.
\item
Remember that in HOL if-and-only-if is expressed using equality.  But equality
has a high priority, as befitting a  relation, while if-and-only-if typically
has the lowest priority.  Thus, \verb$~ ~ P = P$ means \verb$~ ~(P = P)$ and
not \verb$(~ ~P) = P$. When using \texttt{=} to mean logical equivalence,
enclose both operands in parentheses, as in \texttt{(A \& B) = (B \& A)}.
\item
Constructs with an opening but without a closing delimiter bind very weakly
and should therefore be enclosed in parentheses if they appear in subterms, as
in \texttt{f = (\%x.~x)}. This includes
\ttindex{if}, \ttindex{let}, \ttindex{case}, \verb$%$ and quantifiers.
\item
Never write \texttt{\%x.x} or \texttt{!x.x=x} because \texttt{x.x} is always
read as a single qualified identifier that refers to an item \texttt{x} in
theory \texttt{x}. Write \texttt{\%x.~x} and \texttt{!x.~x=x} instead.
\end{itemize}

\section{Variables}
\label{sec:variables}

Isabelle distinguishes free and bound variables just as is customary. Bound
variables are automatically renamed to avoid clashes with free variables. In
addition, Isabelle has a third kind of variable, called a \bfindex{schematic
  variable} or \bfindex{unknown}, which starts with a \texttt{?}.  Logically,
an unknown is a free variable. But it may be instantiated by another term
during the proof process. For example, the mathematical theorem $x = x$ is
represented in Isabelle as \texttt{?x = ?x}, which means that Isabelle can
instantiate it arbitrarily. This is in contrast to ordinary variables, which
remain fixed. The programming language Prolog calls unknowns {\em logical\/}
variables.

Most of the time you can and should ignore unknowns and work with ordinary
variables. Just don't be surprised that after you have finished the
proof of a theorem, Isabelle (i.e.\ \ttindex{qed} at the end of a proof) will
turn your free variables into unknowns: it merely indicates that Isabelle will
automatically instantiate those unknowns suitably when the theorem is used in
some other proof.
\begin{warn}
  The existential quantifier \texttt{?}\index{$HOLQ@{\tt?}} needs to be
  followed by a space. Otherwise \texttt{?x} is interpreted as a schematic
  variable.
\end{warn}

\section{Getting started}

Assuming you have installed Isabelle, you start it by typing \texttt{isabelle
  HOL} in a shell window.\footnote{Simply executing \texttt{isabelle} without
  an argument starts the default logic, which usually is already \texttt{HOL}.
  This is controlled by the \texttt{ISABELLE_LOGIC} setting, see \emph{The
    Isabelle System Manual} for more details.} This presents you with
Isabelle's most basic ASCII interface.  In addition you need to open an editor
window to create theories (\texttt{.thy} files) and proof scripts
(\texttt{.ML} files). While you are developing a proof, we recommend to type
each proof command into the ML-file first and then enter it into Isabelle by
copy-and-paste, thus ensuring that you have a complete record of your proof.