%% $Id$
\chapter{Theories, Terms and Types} \label{theories}
\index{theories|(}\index{signatures|bold}
\index{reading!axioms|see{{\tt assume_ax}}}
Theories organize the syntax, declarations and axioms of a mathematical
development. They are built, starting from the Pure theory, by extending
and merging existing theories. They have the \ML\ type \mltydx{theory}.
Theory operations signal errors by raising exception \xdx{THEORY},
returning a message and a list of theories.
Signatures, which contain information about sorts, types, constants and
syntax, have the \ML\ type~\mltydx{Sign.sg}. For identification, each
signature carries a unique list of \bfindex{stamps}, which are \ML\
references to strings. The strings serve as human-readable names; the
references serve as unique identifiers. Each primitive signature has a
single stamp. When two signatures are merged, their lists of stamps are
also merged. Every theory carries a unique signature.
Terms and types are the underlying representation of logical syntax. Their
\ML\ definitions are irrelevant to naive Isabelle users. Programmers who
wish to extend Isabelle may need to know such details, say to code a tactic
that looks for subgoals of a particular form. Terms and types may be
`certified' to be well-formed with respect to a given signature.
\section{Defining theories}\label{sec:ref-defining-theories}
Theories are usually defined using theory definition files (which have a name
suffix {\tt .thy}). There is also a low level interface provided by certain
\ML{} functions (see \S\ref{BuildingATheory}).
Appendix~\ref{app:TheorySyntax} presents the concrete syntax for theory
definitions; here is an explanation of the constituent parts:
\begin{description}
\item[{\it theoryDef}]
is the full definition. The new theory is called $id$. It is the union
of the named {\bf parent theories}\indexbold{theories!parent}, possibly
extended with new classes, etc. The basic theory, which contains only
the meta-logic, is called \thydx{Pure}.
Normally each {\it name\/} is an identifier, the name of the parent theory.
Quoted strings can be used to document additional file dependencies; see
\S\ref{LoadingTheories} for details.
\item[$classes$]
is a series of class declarations. Declaring {\tt$id$ < $id@1$ \dots\
$id@n$} makes $id$ a subclass of the existing classes $id@1\dots
id@n$. This rules out cyclic class structures. Isabelle automatically
computes the transitive closure of subclass hierarchies; it is not
necessary to declare {\tt c < e} in addition to {\tt c < d} and {\tt d <
e}.
\item[$default$]
introduces $sort$ as the new default sort for type variables. This applies
to unconstrained type variables in an input string but not to type
variables created internally. If omitted, the default sort is the listwise
union of the default sorts of the parent theories (i.e.\ their logical
intersection).
\item[$sort$]
is a finite set of classes. A single class $id$ abbreviates the singleton
set {\tt\{}$id${\tt\}}.
\item[$types$]
is a series of type declarations. Each declares a new type constructor
or type synonym. An $n$-place type constructor is specified by
$(\alpha@1,\dots,\alpha@n)name$, where the type variables serve only to
indicate the number~$n$.
A {\bf type synonym}\indexbold{type synonyms} is an abbreviation
$(\alpha@1,\dots,\alpha@n)name = \mbox{\tt"}\tau\mbox{\tt"}$, where
$name$ can be a string and $\tau$ must be enclosed in quotation marks.
\item[$infix$]
declares a type or constant to be an infix operator of priority $nat$
associating to the left ({\tt infixl}) or right ({\tt infixr}). Only
2-place type constructors can have infix status; an example is {\tt
('a,'b)~"*"~(infixr~20)}, which expresses binary product types.
\item[$arities$]
is a series of arity declarations. Each assigns arities to type
constructors. The $name$ must be an existing type constructor, which is
given the additional arity $arity$.
\item[$constDecl$]
is a series of constant declarations. Each new constant $name$ is given
the specified type. The optional $mixfix$ annotations may
attach concrete syntax to the constant. A variant of {\tt consts} is the
{\tt syntax} section\index{*syntax section}, which adds just syntax without
declaring logical constants.
\item[$mixfix$] \index{mixfix declarations}
annotations can take three forms:
\begin{itemize}
\item A mixfix template given as a $string$ of the form
{\tt"}\dots{\tt\_}\dots{\tt\_}\dots{\tt"} where the $i$-th underscore
indicates the position where the $i$-th argument should go. The list
of numbers gives the priority of each argument. The final number gives
the priority of the whole construct.
\item A constant $f$ of type $\tau@1\To(\tau@2\To\tau)$ can be given {\bf
infix} status.
\item A constant $f$ of type $(\tau@1\To\tau@2)\To\tau$ can be given {\bf
binder} status. The declaration {\tt binder} $\cal Q$ $p$ causes
${\cal Q}\,x.F(x)$ to be treated
like $f(F)$, where $p$ is the priority.
\end{itemize}
\item[$trans$]
specifies syntactic translation rules (macros). There are three forms:
parse rules ({\tt =>}), print rules ({\tt <=}), and parse/print rules ({\tt
==}).
\item[$rule$]
is a series of rule declarations. Each has a name $id$ and the formula is
given by the $string$. Rule names must be distinct within any single
theory file.
\item[$ml$] \index{*ML section}
consists of \ML\ code, typically for parse and print translation functions.
\end{description}
%
Chapters~\ref{Defining-Logics} and \ref{chap:syntax} explain mixfix
declarations, translation rules and the {\tt ML} section in more detail.
\subsection{*Classes and arities}
\index{classes!context conditions}\index{arities!context conditions}
In order to guarantee principal types~\cite{nipkow-prehofer},
arity declarations must obey two conditions:
\begin{itemize}
\item There must be no two declarations $ty :: (\vec{r})c$ and $ty ::
(\vec{s})c$ with $\vec{r} \neq \vec{s}$. For example, the following is
forbidden:
\begin{ttbox}
types
'a ty
arities
ty :: ({\ttlbrace}logic{\ttrbrace}) logic
ty :: ({\ttlbrace}{\ttrbrace})logic
\end{ttbox}
\item If there are two declarations $ty :: (s@1,\dots,s@n)c$ and $ty ::
(s@1',\dots,s@n')c'$ such that $c' < c$ then $s@i' \preceq s@i$ must hold
for $i=1,\dots,n$. The relationship $\preceq$, defined as
\[ s' \preceq s \iff \forall c\in s. \exists c'\in s'.~ c'\le c, \]
expresses that the set of types represented by $s'$ is a subset of the set of
types represented by $s$. For example, the following is forbidden:
\begin{ttbox}
classes
term < logic
types
'a ty
arities
ty :: ({\ttlbrace}logic{\ttrbrace})logic
ty :: ({\ttlbrace}{\ttrbrace})term
\end{ttbox}
\end{itemize}
\section{Loading a new theory}\label{LoadingTheories}
\index{theories!loading}\index{files!reading}
\begin{ttbox}
use_thy : string -> unit
time_use_thy : string -> unit
loadpath : string list ref \hfill{\bf initially {\tt["."]}}
delete_tmpfiles : bool ref \hfill{\bf initially true}
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{use_thy} $thyname$]
reads the theory $thyname$ and creates an \ML{} structure as described below.
\item[\ttindexbold{time_use_thy} $thyname$]
calls {\tt use_thy} $thyname$ and reports the time taken.
\item[\ttindexbold{loadpath}]
contains a list of directories to search when locating the files that
define a theory. This list is only used if the theory name in {\tt
use_thy} does not specify the path explicitly.
\item[\ttindexbold{delete_tmpfiles} := false;]
suppresses the deletion of temporary files.
\end{ttdescription}
%
Each theory definition must reside in a separate file. Let the file {\it
T}{\tt.thy} contain the definition of a theory called~$T$, whose parent
theories are $TB@1$ \dots $TB@n$. Calling \ttindexbold{use_thy}~{\tt"{\it
T\/}"} reads the file {\it T}{\tt.thy}, writes a temporary \ML{}
file {\tt.{\it T}.thy.ML}, and reads the latter file. Recursive {\tt
use_thy} calls load those parent theories that have not been loaded
previously; the recursive calls may continue to any depth. One {\tt use_thy}
call can read an entire logic provided all theories are linked appropriately.
The result is an \ML\ structure~$T$ containing at least a component {\tt thy}
for the new theory and components for each of the rules. The structure also
contains the definitions of the {\tt ML} section, if present. The file
{\tt.{\it T}.thy.ML} is then deleted if {\tt delete_tmpfiles} is set to {\tt
true} and no errors occurred.
Finally the file {\it T}{\tt.ML} is read, if it exists. This file normally
begins with the declaration {\tt open~$T$} and contains proofs involving
the new theory.
Some applications construct theories directly by calling \ML\ functions. In
this situation there is no {\tt.thy} file, only an {\tt.ML} file. The
{\tt.ML} file must declare an \ML\ structure having the theory's name and a
component {\tt thy} containing the new theory object.
Section~\ref{sec:pseudo-theories} below describes a way of linking such
theories to their parents.
\begin{warn}
Temporary files are written to the current directory, so this must be
writable. Isabelle inherits the current directory from the operating
system; you can change it within Isabelle by typing {\tt
cd"$dir$"}\index{*cd}.
\end{warn}
\section{Reloading modified theories}\label{sec:reloading-theories}
\indexbold{theories!reloading}
\begin{ttbox}
update : unit -> unit
unlink_thy : string -> unit
\end{ttbox}
Changing a theory on disk often makes it necessary to reload all theories
descended from it. However, {\tt use_thy} reads only one theory, even if
some of the parent theories are out of date. In this case you should call
{\tt update()}.
Isabelle keeps track of all loaded theories and their files. If
\ttindex{use_thy} finds that the theory to be loaded has been read before,
it determines whether to reload the theory as follows. First it looks for
the theory's files in their previous location. If it finds them, it
compares their modification times to the internal data and stops if they
are equal. If the files have been moved, {\tt use_thy} searches for them
as it would for a new theory. After {\tt use_thy} reloads a theory, it
marks the children as out-of-date.
\begin{ttdescription}
\item[\ttindexbold{update}()]
reloads all modified theories and their descendants in the correct order.
\item[\ttindexbold{unlink_thy} $thyname$]\indexbold{theories!removing}
informs Isabelle that theory $thyname$ no longer exists. If you delete the
theory files for $thyname$ then you must execute {\tt unlink_thy};
otherwise {\tt update} will complain about a missing file.
\end{ttdescription}
\goodbreak
\subsection{Important note for Poly/ML users}\index{Poly/{\ML} compiler}
The theory mechanism depends upon reference variables. At the end of a
Poly/\ML{} session, the contents of references are lost unless they are
declared in the current database. In particular, assignments to references
of the {\tt Pure} database are lost, including all information about loaded
theories. To avoid losing this information simply call
\begin{ttbox}
init_thy_reader();
\end{ttbox}
when building the new database.
\subsection{*Pseudo theories}\label{sec:pseudo-theories}
\indexbold{theories!pseudo}%
Any automatic reloading facility requires complete knowledge of all
dependencies. Sometimes theories depend on objects created in \ML{} files
with no associated theory definition file. These objects may be theories but
they could also be theorems, proof procedures, etc.
Unless such dependencies are documented, {\tt update} fails to reload these
\ML{} files and the system is left in a state where some objects, such as
theorems, still refer to old versions of theories. This may lead to the
error
\begin{ttbox}
Attempt to merge different versions of theories: \dots
\end{ttbox}
Therefore there is a way to link theories and {\bf orphaned} \ML{} files ---
those not associated with a theory definition.
Let us assume we have an orphaned \ML{} file named {\tt orphan.ML} and a
theory~$B$ that depends on {\tt orphan.ML} --- for example, {\tt B.ML} uses
theorems proved in {\tt orphan.ML}. Then {\tt B.thy} should
mention this dependency as follows:
\begin{ttbox}
B = \(\ldots\) + "orphan" + \(\ldots\)
\end{ttbox}
Quoted strings stand for theories which have to be loaded before the
current theory is read but which are not used in building the base of
theory~$B$. Whenever {\tt orphan} changes and is reloaded, Isabelle
knows that $B$ has to be updated, too.
Note that it's necessary for {\tt orphan} to declare a special ML
object of type {\tt theory} which is present in all theories. This is
normally achieved by adding the file {\tt orphan.thy} to make {\tt
orphan} a {\bf pseudo theory}. A minimum version of {\tt orphan.thy}
would be
\begin{ttbox}
orphan = Pure
\end{ttbox}
which uses {\tt Pure} to make a dummy theory. Normally though the
orphaned file has its own dependencies. If {\tt orphan.ML} depends on
theories or files $A@1$, \ldots, $A@n$, record this by creating the
pseudo theory in the following way:
\begin{ttbox}
orphan = \(A@1\) + \(\ldots\) + \(A@n\)
\end{ttbox}
The resulting theory ensures that {\tt update} reloads {\tt orphan}
whenever it reloads one of the $A@i$.
For an extensive example of how this technique can be used to link lots of
theory files and load them by just a few {\tt use_thy} calls, consult the
sources of \ZF{} set theory.
\section{Basic operations on theories}\label{BasicOperationsOnTheories}
\subsection{Extracting an axiom or theorem from a theory}
\index{theories!axioms of}\index{axioms!extracting}
\index{theories!theorems of}\index{theorems!extracting}
\begin{ttbox}
get_axiom : theory -> string -> thm
get_thm : theory -> string -> thm
assume_ax : theory -> string -> thm
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{get_axiom} $thy$ $name$]
returns an axiom with the given $name$ from $thy$, raising exception
\xdx{THEORY} if none exists. Merging theories can cause several axioms
to have the same name; {\tt get_axiom} returns an arbitrary one.
\item[\ttindexbold{get_thm} $thy$ $name$]
is analogous to {\tt get_axiom}, but looks for a stored theorem. Like
{\tt get_axiom} it searches all parents of a theory if the theorem
is not associated with $thy$.
\item[\ttindexbold{assume_ax} $thy$ $formula$]
reads the {\it formula} using the syntax of $thy$, following the same
conventions as axioms in a theory definition. You can thus pretend that
{\it formula} is an axiom and use the resulting theorem like an axiom.
Actually {\tt assume_ax} returns an assumption; \ttindex{result}
complains about additional assumptions, but \ttindex{uresult} does not.
For example, if {\it formula} is
\hbox{\tt a=b ==> b=a} then the resulting theorem has the form
\hbox{\verb'?a=?b ==> ?b=?a [!!a b. a=b ==> b=a]'}
\end{ttdescription}
\subsection{Building a theory}
\label{BuildingATheory}
\index{theories!constructing|bold}
\begin{ttbox}
pure_thy : theory
merge_theories : theory * theory -> theory
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{pure_thy}] contains just the syntax and signature
of the meta-logic. There are no axioms: meta-level inferences are carried
out by \ML\ functions.
\item[\ttindexbold{merge_theories} ($thy@1$, $thy@2$)] merges the two
theories $thy@1$ and $thy@2$. The resulting theory contains all of the
syntax, signature and axioms of the constituent theories. Merging theories
that contain different identification stamps of the same name fails with
the following message
\begin{ttbox}
Attempt to merge different versions of theories: "\(T@1\)", \(\ldots\), "\(T@n\)"
\end{ttbox}
This error may especially occur when a theory is redeclared --- say to
change an incorrect axiom --- and bindings to old versions persist.
Isabelle ensures that old and new theories of the same name are not
involved in a proof.
%% FIXME
%\item [\ttindexbold{extend_theory} $thy$ {\tt"}$T${\tt"} $\cdots$] extends
% the theory $thy$ with new types, constants, etc. $T$ identifies the theory
% internally. When a theory is redeclared, say to change an incorrect axiom,
% bindings to the old axiom may persist. Isabelle ensures that the old and
% new theories are not involved in the same proof. Attempting to combine
% different theories having the same name $T$ yields the fatal error
%extend_theory : theory -> string -> \(\cdots\) -> theory
%\begin{ttbox}
%Attempt to merge different versions of theory: \(T\)
%\end{ttbox}
\end{ttdescription}
%% FIXME
%\item [\ttindexbold{extend_theory} $thy$ {\tt"}$T${\tt"}
% ($classes$, $default$, $types$, $arities$, $consts$, $sextopt$) $rules$]
%\hfill\break %%% include if line is just too short
%is the \ML{} equivalent of the following theory definition:
%\begin{ttbox}
%\(T\) = \(thy\) +
%classes \(c\) < \(c@1\),\(\dots\),\(c@m\)
% \dots
%default {\(d@1,\dots,d@r\)}
%types \(tycon@1\),\dots,\(tycon@i\) \(n\)
% \dots
%arities \(tycon@1'\),\dots,\(tycon@j'\) :: (\(s@1\),\dots,\(s@n\))\(c\)
% \dots
%consts \(b@1\),\dots,\(b@k\) :: \(\tau\)
% \dots
%rules \(name\) \(rule\)
% \dots
%end
%\end{ttbox}
%where
%\begin{tabular}[t]{l@{~=~}l}
%$classes$ & \tt[("$c$",["$c@1$",\dots,"$c@m$"]),\dots] \\
%$default$ & \tt["$d@1$",\dots,"$d@r$"]\\
%$types$ & \tt[([$tycon@1$,\dots,$tycon@i$], $n$),\dots] \\
%$arities$ & \tt[([$tycon'@1$,\dots,$tycon'@j$], ([$s@1$,\dots,$s@n$],$c$)),\dots]
%\\
%$consts$ & \tt[([$b@1$,\dots,$b@k$],$\tau$),\dots] \\
%$rules$ & \tt[("$name$",$rule$),\dots]
%\end{tabular}
\subsection{Inspecting a theory}\label{sec:inspct-thy}
\index{theories!inspecting|bold}
\begin{ttbox}
print_theory : theory -> unit
axioms_of : theory -> (string * thm) list
thms_of : theory -> (string * thm) list
parents_of : theory -> theory list
sign_of : theory -> Sign.sg
stamps_of_thy : theory -> string ref list
\end{ttbox}
These provide means of viewing a theory's components.
\begin{ttdescription}
\item[\ttindexbold{print_theory} $thy$]
prints the contents of $thy$ excluding the syntax related parts (which are
shown by {\tt print_syntax}). The output is quite verbose.
\item[\ttindexbold{axioms_of} $thy$]
returns the additional axioms of the most recent extend node of~$thy$.
\item[\ttindexbold{thms_of} $thy$]
returns all theorems that are associated with $thy$.
\item[\ttindexbold{parents_of} $thy$]
returns the direct ancestors of~$thy$.
\item[\ttindexbold{sign_of} $thy$]
returns the signature associated with~$thy$. It is useful with functions
like {\tt read_instantiate_sg}, which take a signature as an argument.
\item[\ttindexbold{stamps_of_thy} $thy$]\index{signatures}
returns the identification \rmindex{stamps} of the signature associated
with~$thy$.
\end{ttdescription}
\section{Generating HTML documents}
\index{HTML|bold}
Isabelle is able to make HTML documents that show a theory's
definition, the theorems proved in its ML file and the relationship
with its ancestors and descendants. HTML stands for Hypertext Markup
Language and is used in the World Wide Web to represent text
containing images and links to other documents. Web browsers like
{\tt xmosaic} or {\tt netscape} are used to view these documents.
Besides the three HTML files that are made for every theory
(definition and theorems, ancestors, descendants), Isabelle stores
links to all theories in an index file. These indexes are themself
linked with other indexes to represent the hierarchic structure of
Isabelle's logics.
To make HTML files for logics that are part of the Isabelle
distribution, simply set the shell environment variable {\tt
MAKE_HTML} before compiling a logic. This works for single logics as
well as for the shell script {\tt make-all} (see
\ref{sec:shell-scripts}). To make HTML files for {\tt FOL} using a
{\tt csh} style shell, the following commands can be used:
\begin{ttbox}
cd FOL
setenv MAKE_HTML
make
\end{ttbox}
As some of Isabelle's logics are based on others (e.g. {\tt ZF} on
{\tt FOL}) and because the HTML files list a theory's ancestors, you
should not make HTML files for a logic if the HTML files for the base
logic do not exist. Otherwise some of the hypertext links might point
to non-existing documents.
The entry point to all logics is the {\tt index.html} file located in
Isabelle's main directory. You can also access a HTML version of the
distribution package at
\begin{ttbox}
http://www4.informatik.tu-muenchen.de/~nipkow/isabelle
\end{ttbox}
\subsection*{Manual HTML generation}
To manually activate and deactivate the generation of HTML files the
following commands and reference variables are used:
\begin{ttbox}
init_html : unit -> unit
make_html : bool ref
finish_html : unit -> unit
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{init_html}]
activates the HTML facility. It stores the current working directory
as the place where the {\tt index.html} file for all theories loaded
afterwards will be stored.
\item[\ttindexbold{make_html}]
is a boolean reference variable read by {\tt use_thy} and other
functions to decide whether HTML files should be made. After you have
used {\tt init_html} you can manually change {\tt make_html}'s value
to temporarily disable HTML generation.
\item[\ttindexbold{finish_html}]
has to be called after all theories have been read that should be
listed in the current {\tt index.html} file. It reads a temporary
file with information about the theories read since the last use of
{\tt init_html} and makes the {\tt index.html} file. If {\tt
make_html} is {\tt false} nothing is done.
The indexes made by this function also contain a link to the {\tt
README} file if there exists one in the directory were the index is
stored. If there's a {\tt README.html} file it is used instead of
{\tt README}.
\end{ttdescription}
The above functions could be used in the following way:
\begin{ttbox}
init_html();
{\out Setting path for index.html to "/home/stud/clasohm/isabelle/HOL"}
use_thy "List";
\dots
finish_html();
\end{ttbox}
Please note that HTML files are made only for those theories that are
read while {\tt make_html} is {\tt true}. These files may contain
links to theories that were read with a {\tt false} {\tt make_html}
and therefore point to non-existing files.
\subsection*{Extending or adding a logic}
If you add a new subdirectory to Isabelle's logics (or add a completly
new logic), you would have to call {\tt init_html} at the start of every
directory's {\tt ROOT.ML} file and {\tt finish_html} at the end of
it. This is automatically done if you use
\begin{ttbox}\index{use_dir}
use_dir : string -> unit
\end{ttbox}
This function takes a path as its parameter, changes the working
directory, calls {\tt init_html} if {\tt make_html} is {\tt true},
executes {\tt ROOT.ML}, and calls {\tt finish_html}. The {\tt
index.html} file written in this directory will be automatically
linked to the first index found in the (recursively searched)
superdirectories.
Instead of adding something like
\begin{ttbox}
use"ex/ROOT.ML";
\end{ttbox}
to the logic's makefile you have to use this:
\begin{ttbox}
use_dir"ex";
\end{ttbox}
Since {\tt use_dir} calls {\tt init_html} only if {\tt make_html} is
{\tt true} the generation of HTML files depends on the value this
reference variable has. It can either be inherited from the used \ML{}
database or set in the makefile before {\tt use_dir} is invoked,
e.g. to set it's value according to the environment variable {\tt
MAKE_HTML}.
The generated HTML files contain all theorems that were proved in the
theory's \ML{} file with {\tt qed}, {\tt qed_goal} and {\tt qed_goalw},
or stored with {\tt bind_thm} and {\tt store_thm}. Additionally there
is a hypertext link to the whole \ML{} file.
\subsection*{Using someone else's database}
To make them independent from their storage place, the HTML files only
contain relative paths which are derived from absolute ones like the
current working directory, {\tt gif_path} or {\tt base_path}. The
latter two are reference variables which are initialized at the time
when the {\tt Pure} database is made. Because you need write access
for the current directory to make HTML files and therefore (probably)
generate them in your home directory, the absolute {\tt base_path} is
not correct if you use someone else's database or a database derived
from it.
In that case you first have to set {\tt base_path} to the value of
{\em your} Isabelle main directory, i.e. the directory that contains
the subdirectories where standard logics like {\tt FOL} and {\tt HOL}
or your own logics are stored.
It's also a good idea to set {\tt gif_path} which points to the
directory containing two GIF images used in the HTML
documents. Normally this is the {\tt Tools} subdirectory of Isabelle's
main directory. While its value in general is still valid, your HTML
files would depend on files not owned by you. This prevents you from
changing the location of the HTML files (as they contain relative
paths) and also causes trouble if the database's maker (re)moves the
GIFs.
Here's what you have to do before invoking {\tt init_html} using
someone else's \ML{} database:
\begin{ttbox}
base_path := "/home/smith/isabelle";
gif_path := "/home/smith/isabelle/Tools";
init_html();
\dots
\end{ttbox}
\section{Terms}
\index{terms|bold}
Terms belong to the \ML\ type \mltydx{term}, which is a concrete datatype
with six constructors: there are six kinds of term.
\begin{ttbox}
type indexname = string * int;
infix 9 $;
datatype term = Const of string * typ
| Free of string * typ
| Var of indexname * typ
| Bound of int
| Abs of string * typ * term
| op $ of term * term;
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{Const}($a$, $T$)] \index{constants|bold}
is the {\bf constant} with name~$a$ and type~$T$. Constants include
connectives like $\land$ and $\forall$ as well as constants like~0
and~$Suc$. Other constants may be required to define a logic's concrete
syntax.
\item[\ttindexbold{Free}($a$, $T$)] \index{variables!free|bold}
is the {\bf free variable} with name~$a$ and type~$T$.
\item[\ttindexbold{Var}($v$, $T$)] \index{unknowns|bold}
is the {\bf scheme variable} with indexname~$v$ and type~$T$. An
\mltydx{indexname} is a string paired with a non-negative index, or
subscript; a term's scheme variables can be systematically renamed by
incrementing their subscripts. Scheme variables are essentially free
variables, but may be instantiated during unification.
\item[\ttindexbold{Bound} $i$] \index{variables!bound|bold}
is the {\bf bound variable} with de Bruijn index~$i$, which counts the
number of lambdas, starting from zero, between a variable's occurrence
and its binding. The representation prevents capture of variables. For
more information see de Bruijn \cite{debruijn72} or
Paulson~\cite[page~336]{paulson91}.
\item[\ttindexbold{Abs}($a$, $T$, $u$)]
\index{lambda abs@$\lambda$-abstractions|bold}
is the $\lambda$-{\bf abstraction} with body~$u$, and whose bound
variable has name~$a$ and type~$T$. The name is used only for parsing
and printing; it has no logical significance.
\item[$t$ \$ $u$] \index{$@{\tt\$}|bold} \index{function applications|bold}
is the {\bf application} of~$t$ to~$u$.
\end{ttdescription}
Application is written as an infix operator to aid readability.
Here is an \ML\ pattern to recognize \FOL{} formulae of
the form~$A\imp B$, binding the subformulae to~$A$ and~$B$:
\begin{ttbox}
Const("Trueprop",_) $ (Const("op -->",_) $ A $ B)
\end{ttbox}
\section{Variable binding}
\begin{ttbox}
loose_bnos : term -> int list
incr_boundvars : int -> term -> term
abstract_over : term*term -> term
variant_abs : string * typ * term -> string * term
aconv : term*term -> bool\hfill{\bf infix}
\end{ttbox}
These functions are all concerned with the de Bruijn representation of
bound variables.
\begin{ttdescription}
\item[\ttindexbold{loose_bnos} $t$]
returns the list of all dangling bound variable references. In
particular, {\tt Bound~0} is loose unless it is enclosed in an
abstraction. Similarly {\tt Bound~1} is loose unless it is enclosed in
at least two abstractions; if enclosed in just one, the list will contain
the number 0. A well-formed term does not contain any loose variables.
\item[\ttindexbold{incr_boundvars} $j$]
increases a term's dangling bound variables by the offset~$j$. This is
required when moving a subterm into a context where it is enclosed by a
different number of abstractions. Bound variables with a matching
abstraction are unaffected.
\item[\ttindexbold{abstract_over} $(v,t)$]
forms the abstraction of~$t$ over~$v$, which may be any well-formed term.
It replaces every occurrence of \(v\) by a {\tt Bound} variable with the
correct index.
\item[\ttindexbold{variant_abs} $(a,T,u)$]
substitutes into $u$, which should be the body of an abstraction.
It replaces each occurrence of the outermost bound variable by a free
variable. The free variable has type~$T$ and its name is a variant
of~$a$ chosen to be distinct from all constants and from all variables
free in~$u$.
\item[$t$ \ttindexbold{aconv} $u$]
tests whether terms~$t$ and~$u$ are \(\alpha\)-convertible: identical up
to renaming of bound variables.
\begin{itemize}
\item
Two constants, {\tt Free}s, or {\tt Var}s are \(\alpha\)-convertible
if their names and types are equal.
(Variables having the same name but different types are thus distinct.
This confusing situation should be avoided!)
\item
Two bound variables are \(\alpha\)-convertible
if they have the same number.
\item
Two abstractions are \(\alpha\)-convertible
if their bodies are, and their bound variables have the same type.
\item
Two applications are \(\alpha\)-convertible
if the corresponding subterms are.
\end{itemize}
\end{ttdescription}
\section{Certified terms}\index{terms!certified|bold}\index{signatures}
A term $t$ can be {\bf certified} under a signature to ensure that every type
in~$t$ is well-formed and every constant in~$t$ is a type instance of a
constant declared in the signature. The term must be well-typed and its use
of bound variables must be well-formed. Meta-rules such as {\tt forall_elim}
take certified terms as arguments.
Certified terms belong to the abstract type \mltydx{cterm}.
Elements of the type can only be created through the certification process.
In case of error, Isabelle raises exception~\ttindex{TERM}\@.
\subsection{Printing terms}
\index{terms!printing of}
\begin{ttbox}
string_of_cterm : cterm -> string
Sign.string_of_term : Sign.sg -> term -> string
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{string_of_cterm} $ct$]
displays $ct$ as a string.
\item[\ttindexbold{Sign.string_of_term} $sign$ $t$]
displays $t$ as a string, using the syntax of~$sign$.
\end{ttdescription}
\subsection{Making and inspecting certified terms}
\begin{ttbox}
cterm_of : Sign.sg -> term -> cterm
read_cterm : Sign.sg -> string * typ -> cterm
cert_axm : Sign.sg -> string * term -> string * term
read_axm : Sign.sg -> string * string -> string * term
rep_cterm : cterm -> \{T:typ, t:term, sign:Sign.sg, maxidx:int\}
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{cterm_of} $sign$ $t$] \index{signatures}
certifies $t$ with respect to signature~$sign$.
\item[\ttindexbold{read_cterm} $sign$ ($s$, $T$)]
reads the string~$s$ using the syntax of~$sign$, creating a certified term.
The term is checked to have type~$T$; this type also tells the parser what
kind of phrase to parse.
\item[\ttindexbold{cert_axm} $sign$ ($name$, $t$)]
certifies $t$ with respect to $sign$ as a meta-proposition and converts all
exceptions to an error, including the final message
\begin{ttbox}
The error(s) above occurred in axiom "\(name\)"
\end{ttbox}
\item[\ttindexbold{read_axm} $sign$ ($name$, $s$)]
similar to {\tt cert_axm}, but first reads the string $s$ using the syntax of
$sign$.
\item[\ttindexbold{rep_cterm} $ct$]
decomposes $ct$ as a record containing its type, the term itself, its
signature, and the maximum subscript of its unknowns. The type and maximum
subscript are computed during certification.
\end{ttdescription}
\section{Types}\index{types|bold}
Types belong to the \ML\ type \mltydx{typ}, which is a concrete datatype with
three constructor functions. These correspond to type constructors, free
type variables and schematic type variables. Types are classified by sorts,
which are lists of classes (representing an intersection). A class is
represented by a string.
\begin{ttbox}
type class = string;
type sort = class list;
datatype typ = Type of string * typ list
| TFree of string * sort
| TVar of indexname * sort;
infixr 5 -->;
fun S --> T = Type ("fun", [S, T]);
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{Type}($a$, $Ts$)] \index{type constructors|bold}
applies the {\bf type constructor} named~$a$ to the type operands~$Ts$.
Type constructors include~\tydx{fun}, the binary function space
constructor, as well as nullary type constructors such as~\tydx{prop}.
Other type constructors may be introduced. In expressions, but not in
patterns, \hbox{\tt$S$-->$T$} is a convenient shorthand for function
types.
\item[\ttindexbold{TFree}($a$, $s$)] \index{type variables|bold}
is the {\bf type variable} with name~$a$ and sort~$s$.
\item[\ttindexbold{TVar}($v$, $s$)] \index{type unknowns|bold}
is the {\bf type unknown} with indexname~$v$ and sort~$s$.
Type unknowns are essentially free type variables, but may be
instantiated during unification.
\end{ttdescription}
\section{Certified types}
\index{types!certified|bold}
Certified types, which are analogous to certified terms, have type
\ttindexbold{ctyp}.
\subsection{Printing types}
\index{types!printing of}
\begin{ttbox}
string_of_ctyp : ctyp -> string
Sign.string_of_typ : Sign.sg -> typ -> string
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{string_of_ctyp} $cT$]
displays $cT$ as a string.
\item[\ttindexbold{Sign.string_of_typ} $sign$ $T$]
displays $T$ as a string, using the syntax of~$sign$.
\end{ttdescription}
\subsection{Making and inspecting certified types}
\begin{ttbox}
ctyp_of : Sign.sg -> typ -> ctyp
rep_ctyp : ctyp -> \{T: typ, sign: Sign.sg\}
\end{ttbox}
\begin{ttdescription}
\item[\ttindexbold{ctyp_of} $sign$ $T$] \index{signatures}
certifies $T$ with respect to signature~$sign$.
\item[\ttindexbold{rep_ctyp} $cT$]
decomposes $cT$ as a record containing the type itself and its signature.
\end{ttdescription}
\index{theories|)}